Gene edited soybean plants with enhanced traits

ABSTRACT

The invention relates to novel soybean plants, seeds and compositions, as well as improvements to plant breeding and methods for creating modifications in plant genomes.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a divisional of U.S. patent application Ser. No. 16/261,243, filed Jan. 29, 2019, which claims benefit of priority to U.S. Provisional Patent Application No. 62/623,485, filed Jan. 29, 2018, which are incorporated herein by reference in their entireties.

SEQUENCE LISTING XML

The instant application contains a sequence listing, which has been submitted in XML file format by electronic submission and is hereby incorporated by reference in its entirety. The XML file, created on Apr. 18, 2023, is named P13468US02.xml and is 1,925,122 bytes in size.

FIELD OF THE INVENTION

Disclosed herein are novel soybean plant cells, soybean plants and soybean seeds derived from such plant cells and having enhanced traits, and methods of making and using such plant cells and derived plants and seeds.

BACKGROUND

Plant breeding and engineering currently rely on Mendelian genetics or recombinant techniques.

SUMMARY

Disclosed herein are methods for providing novel soybean plant cells or soybean plant protoplasts, plant callus, tissues or parts, whole soybean plants, and soybean seeds having one or more altered genetic sequences. Among other features, the methods and compositions described herein enable the stacking of preferred alleles without introducing unwanted genetic or epigenetic variation in the modified plants or plant cells. The efficiency and reliability of these targeted modification methods are significantly improved relative to traditional plant breeding, and can be used not only to augment traditional breeding techniques but also as a substitute for them.

In one aspect, the invention provides a method of changing expression of a sequence of interest in a genome, including integrating a sequence encoded by a polynucleotide, such as a double-stranded or single-stranded polynucleotide including DNA, RNA, or a combination of DNA and RNA, at the site of at least one double-strand break (DSB) in a genome, which can be the genome of a eukaryotic nucleus (e.g., the nuclear genome of a plant cell) or a genome of an organelle (e.g., a mitochondrion or a plastid in a plant cell). Effector molecules for site-specific introduction of a DSB into a genome include various endonucleases (e.g., RNA-guided nucleases such as a type II Cas nuclease, a Cas9, a type V Cas nuclease, a Cpf1, a CasY, a CasX, a C2c1, or a C2c3) and guide RNAs that direct cleavage by an RNA-guided nuclease. Embodiments include those where the DSB is introduced into a genome by a ribonucleoprotein complex containing both a site-specific nuclease (e.g., Cas9, Cpf1, CasX, CasY, C2c1, C2c3) and at least one guide RNA, or by a site-specific nuclease in combination with at least one guide RNA; in some of these embodiments no plasmid or other expression vector is utilized to provide the nuclease, the guide RNA, or the polynucleotide. These effector molecules are delivered to the cell or organelle wherein the DSB is to be introduced by the use of one or more suitable compositions or treatments, such as at least one chemical, enzymatic, or physical agent, or application of heat or cold, ultrasonication, centrifugation, electroporation, particle bombardment, and bacterially mediated transformation. It is generally desirable that the DSB is induced at high efficiency. One measure of efficiency is the percentage or fraction of the population of cells that have been treated with a DSB-inducing agent and in which the DSB is successfully introduced at the correct site in the genome.
The efficiency of genome editing is assessed by any suitable method such as a heteroduplex cleavage assay or by sequencing, as described elsewhere in this disclosure. In various embodiments, the DSB is introduced at a comparatively high efficiency, e.g., at about 20, about 30, about 40, about 50, about 60, about 70, or about 80 percent efficiency, or at greater than 80, 85, 90, or 95 percent efficiency. In embodiments, the DSB is introduced upstream of, downstream of, or within the sequence of interest, which is coding, non-coding, or a combination of coding and non-coding sequence. In embodiments, a sequence encoded by the polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid), when integrated into the site of the DSB in the genome, is then functionally or operably linked to the sequence of interest, e.g., linked in a manner that modifies the transcription or the translation of the sequence of interest or that modifies the stability of a transcript including that of the sequence of interest. Embodiments include those where two or more DSBs are introduced into a genome, and wherein a sequence encoded by a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) that is integrated into each DSB is the same or different for each of the DSBs. In embodiments, at least two DSBs are introduced into a genome by one or more nucleases in such a way that genomic sequence (coding, non-coding, or a combination of coding and non-coding sequence) is deleted between the DSBs (leaving a deletion with blunt ends, overhangs or a combination of a blunt end and an overhang), and a sequence encoded by a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) molecule is integrated between the DSBs (i.e., at the location of the deleted genomic sequence).
The method is particularly useful for integrating into the site of a DSB a heterologous nucleotide sequence that provides a useful function or use. For example, the method is useful for integrating or introducing into the genome a heterologous sequence that stops or knocks out expression of a sequence of interest (such as a gene encoding a protein), or a heterologous sequence that is a unique identifier nucleotide sequence, or a heterologous sequence that is (or that encodes) a sequence recognizable by a specific binding agent or that binds to a specific molecule, or a heterologous sequence that stabilizes or destabilizes a transcript containing it. Embodiments include use of the method to integrate or introduce into a genome sequence of a promoter or promoter-like element (e.g., sequence of an auxin-binding or hormone-binding or transcription-factor-binding element, or sequence of or encoding an aptamer or riboswitch), or a sequence-specific binding or cleavage site sequence (e.g., sequence of or encoding an endonuclease cleavage site, a small RNA recognition site, a recombinase site, a splice site, or a transposon recognition site). In embodiments, the method is used to delete or otherwise modify to make non-functional an endogenous functional sequence, such as a hormone- or transcription-factor-binding element, or a small RNA or recombinase or transposon recognition site. In embodiments, additional molecules are used to effect a desired expression result or a desired genomic change. For example, the method is used to integrate heterologous recombinase recognition site sequences at two DSBs in a genome, and the appropriate recombinase molecule is employed to excise genomic sequence located between the recombinase recognition sites. In another example, the method is used to integrate a polynucleotide-encoded heterologous small RNA recognition site sequence at a DSB in a sequence of interest in a genome, wherein when the small RNA is present (e.g., expressed endogenously or transiently or transgenically), the small RNA binds to and cleaves the transcript of the sequence of interest that contains the integrated small RNA recognition site.
In another example, the method is used to integrate in the genome of a soybean plant or plant cell a polynucleotide-encoded promoter or promoter-like element that is responsive to a specific molecule (e.g., an auxin, a hormone, a drug, an herbicide, or a polypeptide), wherein a specific level of expression of the sequence of interest is obtained by providing the corresponding specific molecule to the plant or plant cell; in a non-limiting example, an auxin-binding element is integrated into the promoter region of a protein-coding sequence in the genome of a plant or plant cell, whereby the expression of the protein is upregulated when the corresponding auxin is exogenously provided to the plant or plant cell (e.g., by adding the auxin to the medium of the plant cell or by spraying the auxin onto the plant). Another aspect of the invention is a soybean cell including in its genome a heterologous DNA sequence, wherein the heterologous sequence includes (a) nucleotide sequence of a polynucleotide integrated by the method at the site of a DSB in the genome, and (b) genomic nucleotide sequence adjacent to the site of the DSB; related aspects include a plant containing such a cell including in its genome a heterologous DNA sequence, progeny seed or plants (including hybrid progeny seed or plants) of the plant, and processed or commodity products derived from the plant or from progeny seed or plants.
In another aspect, the invention provides a heterologous nucleotide sequence including (a) nucleotide sequence of a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) molecule integrated by the method at the site of a DSB in a genome, and (b) genomic nucleotide sequence adjacent to the site of the DSB; related aspects include larger polynucleotides such as a plasmid, vector, or chromosome including the heterologous nucleotide sequence, as well as a polymerase primer for amplification of the heterologous nucleotide sequence.

In another aspect, the invention provides a composition including a plant cell and a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule that is capable of being integrated at (or having its sequence integrated at) a double-strand break in genomic sequence in the plant cell. In various embodiments, the plant cell is an isolated plant cell or plant protoplast, or is in a monocot plant or dicot plant, a zygotic or somatic embryo, seed, plant part, or plant tissue. In embodiments the plant cell is capable of division or differentiation. In embodiments the plant cell is haploid, diploid, or polyploid. In embodiments, the plant cell includes a double-strand break (DSB) in its genome, at which DSB site the polynucleotide donor molecule is integrated using methods disclosed herein. In embodiments, at least one DSB is induced in the plant cell's genome by including in the composition a DSB-inducing agent, for example, various endonucleases (e.g., RNA-guided nucleases such as a type II Cas nuclease, a Cas9, a type V Cas nuclease, a Cpf1, a CasY, a CasX, a C2c1, or a C2c3) and guide RNAs that direct cleavage by an RNA-guided nuclease; the dsDNA molecule is integrated into the DSB thus induced using methods disclosed herein. Specific embodiments include compositions including a plant cell, at least one dsDNA molecule, and at least one ribonucleoprotein complex containing both a site-specific nuclease (e.g., Cas9, Cpf1, CasX, CasY, C2c1, C2c3) and at least one guide RNA; in some of these embodiments, the composition contains no plasmid or other expression vector for providing the nuclease, the guide RNA, or the dsDNA. In embodiments of the composition, the polynucleotide donor molecule is double-stranded DNA or RNA or a combination of DNA and RNA, and is blunt-ended, or contains one or more terminal overhangs, or contains chemical modifications such as phosphorothioate bonds or a detectable label.
In other embodiments, the polynucleotide donor molecule is a single-stranded polynucleotide composed of DNA or RNA or a combination of DNA and RNA, and can further be chemically modified or labelled. In various embodiments of the composition, the polynucleotide donor molecule includes a nucleotide sequence that provides a useful function when integrated into the site of the DSB. For example, in various non-limiting embodiments the polynucleotide donor molecule includes: sequence that is recognizable by a specific binding agent or that binds to a specific molecule or encodes an RNA molecule or an amino acid sequence that binds to a specific molecule, or sequence that is responsive to a specific change in the physical environment or encodes an RNA molecule or an amino acid sequence that is responsive to a specific change in the physical environment, or heterologous sequence, or sequence that serves to stop transcription or translation at the site of the DSB, or sequence having secondary structure (e.g., double-stranded stems or stem-loops) or that encodes a transcript having secondary structure (e.g., double-stranded RNA that is cleavable by a Dicer-type ribonuclease). In particular embodiments, the modifications to the soybean cell or plant will affect the activity or expression of one or more genes or proteins listed in Table 5, and in some embodiments two or more of those genes or proteins. In related embodiments, the activity or expression of one or more genes or proteins listed in Table 10 will be altered by the introduction or creation of one or more of the regulatory sequences listed in Table 9.

In another aspect, the invention provides a reaction mixture including: (a) a soybean plant cell having at least one double-strand break (DSB) at a locus in its genome; and (b) a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule capable of being integrated at (or having its sequence integrated at) the DSB (preferably by non-homologous end-joining (NHEJ)), wherein the polynucleotide donor molecule has a length of between about 18 to about 300 base-pairs (or nucleotides, if single-stranded), or between about 30 to about 100 base-pairs (or nucleotides, if single-stranded); wherein the polynucleotide donor molecule includes a sequence which, if integrated at the DSB, forms a heterologous insertion (wherein the sequence of the polynucleotide molecule is heterologous with respect to the genomic sequence flanking the insertion site or DSB). In embodiments of the reaction mixture, the plant cell is an isolated plant cell or plant protoplast. In various embodiments, the plant cell is an isolated plant cell or plant protoplast, or is in a monocot plant or dicot plant, a zygotic or somatic embryo, seed, plant part, or plant tissue. In embodiments the plant cell is capable of division or differentiation. In embodiments the plant cell is haploid, diploid, or polyploid. In embodiments of the reaction mixture, the polynucleotide donor molecule includes a nucleotide sequence that provides a useful function or use when integrated into the site of the DSB.
For example, in various non-limiting embodiments the polynucleotide donor molecule includes: sequence that is recognizable by a specific binding agent or that binds to a specific molecule or encodes an RNA molecule or an amino acid sequence that binds to a specific molecule, or sequence that is responsive to a specific change in the physical environment or encodes an RNA molecule or an amino acid sequence that is responsive to a specific change in the physical environment, or heterologous sequence, or sequence that serves to stop transcription or translation at the site of the DSB, or sequence having secondary structure (e.g., double-stranded stems or stem-loops) or that encodes a transcript having secondary structure (e.g., double-stranded RNA that is cleavable by a Dicer-type ribonuclease).

In another aspect, the invention provides a polynucleotide for disrupting gene expression, wherein the polynucleotide is double-stranded and includes at least 18 contiguous base-pairs and encodes at least one stop codon in each possible reading frame on each strand, or is single-stranded and includes at least 11 contiguous nucleotides; and wherein the polynucleotide encodes at least one stop codon in each possible reading frame on each strand. In embodiments, the polynucleotide is a double-stranded DNA (dsDNA) or a double-stranded DNA/RNA hybrid molecule including at least 18 contiguous base-pairs and encoding at least one stop codon in each possible reading frame on either strand. In embodiments, the polynucleotide is a single-stranded DNA or a single-stranded DNA/RNA hybrid molecule including at least 11 contiguous nucleotides and encoding at least one stop codon in each possible reading frame on the strand. Such a polynucleotide is especially useful in methods disclosed herein, wherein, when a sequence encoded by the polynucleotide is integrated or inserted into a genome at the site of a DSB in a sequence of interest (such as a protein-coding gene), the sequence of the heterologously inserted polynucleotide serves to stop translation of the transcript containing the sequence of interest and the heterologously inserted polynucleotide sequence. Embodiments of the polynucleotide include those wherein the polynucleotide includes one or more chemical modifications or labels, e.g., at least one phosphorothioate modification.
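
The stop-codon requirement described above can be checked computationally. The following is a minimal Python sketch, not part of the disclosure: it assumes a DNA sequence written with the letters A, C, G, T and the three standard stop codons (TAA, TAG, TGA). The function names and both example sequences are hypothetical illustrations and are not taken from the sequence listing.

```python
# Illustrative sketch only: checks whether a DNA sequence encodes at least one
# stop codon in every reading frame, on one strand or on both strands.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def frames_with_stop(seq: str) -> list:
    """For frames 0, 1, 2 of the given strand, report whether a stop codon occurs."""
    seq = seq.upper()
    found = []
    for offset in range(3):
        codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
        found.append(any(codon in STOP_CODONS for codon in codons))
    return found


def stops_in_all_frames_one_strand(seq: str) -> bool:
    """Single-stranded case: a stop codon in each of the three frames."""
    return all(frames_with_stop(seq))


def stops_in_all_six_frames(seq: str) -> bool:
    """Double-stranded case: a stop codon in each frame on each strand."""
    return all(frames_with_stop(seq)) and all(frames_with_stop(reverse_complement(seq)))


# Hypothetical 11-nucleotide example: three TAA codons, each offset by one nucleotide.
single_stranded_example = "TAACTAACTAA"
# Hypothetical 22-base-pair example: the sequence above followed by its reverse complement.
double_stranded_example = single_stranded_example + reverse_complement(single_stranded_example)
```

In this sketch the 11-nucleotide example satisfies the single-strand check but not the six-frame check, while the 22-base-pair example satisfies both because it is its own reverse complement.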

In another aspect, the invention provides a method of identifying the locus of at least one double-stranded break (DSB) in genomic DNA in a cell (such as a plant cell) including the genomic DNA, wherein the method includes the steps of: (a) contacting the genomic DNA having a DSB with a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule, wherein the polynucleotide donor molecule is capable of being integrated (or having its sequence integrated) at the DSB (preferably by non-homologous end-joining (NHEJ)) and has a length of between about 18 to about 300 base-pairs (or nucleotides, if single-stranded), or between about 30 to about 100 base-pairs (or nucleotides, if single-stranded); wherein a sequence encoded by the polynucleotide donor molecule, if integrated at the DSB, forms a heterologous insertion; and (b) using at least part of the sequence of the polynucleotide molecule as a target for PCR primers to allow amplification of DNA in the locus of the DSB.
In a related aspect, the invention provides a method of identifying the locus of double-stranded breaks (DSBs) in genomic DNA in a pool of cells (such as plant cells or plant protoplasts), wherein the pool of cells includes cells having genomic DNA with a sequence encoded by a polynucleotide donor molecule inserted at the locus of the double-stranded breaks; wherein the polynucleotide donor molecule is capable of being integrated (or having its sequence integrated) at the DSB and has a length of between about 18 to about 300 base-pairs (or nucleotides, if single-stranded), or between about 30 to about 100 base-pairs (or nucleotides, if single-stranded); wherein a sequence encoded by the polynucleotide donor molecule, if integrated at the DSB, forms a heterologous insertion; and wherein the sequence of the polynucleotide donor molecule is used as a target for PCR primers to allow amplification of DNA in the region of the double-stranded breaks. In embodiments, the pool of cells is a population of plant cells or plant protoplasts, wherein at least some of the cells contain multiple or different DSBs in the genome, each of which can be introduced into the genome by a different guide RNA.
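
Because the inserted donor sequence is known, it can serve as an anchor for recovering the genomic sequence around a break. The following Python sketch is a computational analogue of that idea, applied to a sequence read rather than to a PCR reaction. It is an assumption-laden illustration and not the disclosed laboratory method: the function names, the 20-nucleotide default flank length, and the example donor and read are all hypothetical.

```python
# Illustrative sketch only: find a known donor sequence in a read, in either
# orientation, and return the genomic sequence immediately flanking it.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def flanks_of_donor(read: str, donor: str, flank: int = 20):
    """Return (left_flank, right_flank, strand) for the first donor match, else None.

    strand is "+" when the donor is found as given and "-" when its reverse
    complement is found. Flanks are reported in the orientation of the read.
    """
    read = read.upper()
    donor = donor.upper()
    for oriented, strand in ((donor, "+"), (reverse_complement(donor), "-")):
        start = read.find(oriented)
        if start != -1:
            end = start + len(oriented)
            left = read[max(0, start - flank):start]
            right = read[end:end + flank]
            return left, right, strand
    return None


# Hypothetical 18-nucleotide donor and a read carrying it between two flanks.
example_donor = "GATTACAGATTACAGATT"
example_read = "ACGTACGT" + example_donor + "TTTTGGGG"
```

The flanks returned for a read would then be compared against a reference genome to name the locus; that alignment step is outside the scope of this sketch.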

In another aspect, the invention provides a method of identifying the nucleotide sequence of a locus in the genome that is associated with a phenotype, the method including the steps of: (a) providing to a population of cells having the genome: (i) multiple different guide RNAs (gRNAs) to induce multiple different double-strand breaks (DSBs) in the genome, wherein each DSB is produced by an RNA-guided nuclease guided to a locus on the genome by one of the gRNAs, and (ii) polynucleotide (such as double-stranded DNA, single-stranded DNA, single-stranded DNA/RNA hybrid, and double-stranded DNA/RNA hybrid) donor molecules having a defined nucleotide sequence, wherein the polynucleotide donor molecules are capable of being integrated (or having their sequence integrated) into the DSBs by non-homologous end-joining (NHEJ); whereby, when a sequence encoded by at least some of the polynucleotide donor molecules is inserted into at least some of the DSBs, a genetically heterogeneous population of cells is produced; (b) selecting from the genetically heterogeneous population of cells a subset of cells that exhibit a phenotype of interest; (c) using a pool of PCR primers that bind to at least part of the nucleotide sequence of the polynucleotide donor molecules to amplify from the subset of cells DNA from the locus of a DSB into which one of the polynucleotide donor molecules has been inserted; and (d) sequencing the amplified DNA to identify the locus associated with the phenotype of interest. In embodiments of the method, the gRNA is provided as a polynucleotide, or as a ribonucleoprotein including the gRNA and the RNA-guided nuclease. Related aspects include the cells produced by the method and pluralities, arrays, and genetically heterogeneous populations of such cells, as well as the subset of cells in which the locus associated with the phenotype has been identified, and callus, seedlings, plantlets, and plants and their seeds, grown or regenerated from such cells.
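
Steps (c) and (d) above end with sequenced amplicons from the phenotype-selected subset of cells. One hypothetical way to summarize such data is to tally the genomic sequence found next to the donor in each amplicon, so that flanks seen repeatedly stand out as candidate loci. The Python sketch below assumes forward-orientation donor matches only and uses the raw flanking sequence as a stand-in for a locus. The function name, the parameters `flank` and `min_count`, and the example sequences are illustrative assumptions, not part of the disclosure.

```python
# Illustrative sketch only: tally donor-adjacent flanking sequences across
# amplicons sequenced from a phenotype-selected subset of cells.
from collections import Counter


def candidate_loci(amplicons, donor, flank=12, min_count=2):
    """Return (flank_sequence, count) pairs seen at least min_count times.

    Only the sequence immediately downstream of a forward-orientation donor
    match is considered; amplicons lacking the donor or a full-length flank
    are skipped. Results are ordered from most to least frequent.
    """
    donor = donor.upper()
    counts = Counter()
    for amplicon in amplicons:
        amplicon = amplicon.upper()
        start = amplicon.find(donor)
        if start == -1:
            continue
        end = start + len(donor)
        downstream = amplicon[end:end + flank]
        if len(downstream) == flank:
            counts[downstream] += 1
    return [(seq, n) for seq, n in counts.most_common() if n >= min_count]


# Hypothetical donor and amplicons: two share a flank, one differs, one lacks the donor.
example_donor = "GATTACAGATTACAGATT"
example_amplicons = [
    example_donor + "AAAACCCCGGGGTT",
    example_donor + "AAAACCCCGGGGAC",
    example_donor + "TTTTTTTTTTTTGG",
    "ACGTACGTACGT",
]
```

A fuller analysis would align each flank to a reference genome and handle both donor orientations; the tally here only shows the grouping step.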

In another aspect, the invention provides a method of modifying a plant cell by creating a plurality of targeted modifications in the genome of the plant cell, wherein the method comprises contacting the genome with one or more targeting agents, wherein the one or more agents comprise or encode predetermined peptide or nucleic acid sequences, wherein the predetermined peptide or nucleic acid sequences bind preferentially at or near predetermined target sites within the plant genome, and wherein the binding directs or facilitates the generation of the plurality of targeted modifications within the genome; wherein the plurality of targeted modifications occurs without an intervening step of separately identifying an individual modification and without a step of separately selecting for the occurrence of an individual modification among the plurality of targeted modifications mediated by the targeting agents; and wherein the targeted modifications alter at least one trait of the plant cell, or at least one trait of a plant comprising the plant cell, or at least one trait of a plant grown from the plant cell, or result in a detectable phenotype in the modified plant cell; and wherein at least two of the targeted modifications are insertions of predetermined sequences encoded by one or more polynucleotide donor molecules, and wherein at least one of the polynucleotide donor molecules lacks homology to the genome sequences adjacent to the site of insertion. In a related embodiment, at least one of the polynucleotide donor molecules used in the method is a single stranded DNA molecule, a single stranded RNA molecule, a single stranded DNA-RNA hybrid molecule, or a duplex RNA-DNA molecule. In another related embodiment, the modified plant cell of the method is a meristematic cell, embryonic cell, or germline cell.
In yet another related embodiment, the methods described in this paragraph, when practiced repeatedly or on a pool of cells, result in an efficiency of at least 1%, e.g., at least 2%, 5%, 7%, 10%, 15%, 20%, 25%, 30%, 35% or more, wherein said efficiency is determined, e.g., by dividing the number of successfully targeted cells by the total number of cells targeted.
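
The efficiency measure stated above is a simple ratio of successfully targeted cells to all cells targeted. A minimal Python sketch of that arithmetic follows, assuming the two cell counts have already been obtained from a suitable assay; the function names and example counts are hypothetical.

```python
# Illustrative sketch only: efficiency = successfully targeted cells / total cells targeted.


def targeting_efficiency(successful_cells: int, total_cells: int) -> float:
    """Return the fraction of targeted cells that were successfully modified."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= successful_cells <= total_cells:
        raise ValueError("successful_cells must lie between 0 and total_cells")
    return successful_cells / total_cells


def meets_efficiency_threshold(successful_cells: int, total_cells: int,
                               threshold_percent: float = 1.0) -> bool:
    """Check whether the efficiency is at least threshold_percent (default 1%)."""
    targeting_efficiency(successful_cells, total_cells)  # validates the inputs
    # Cross-multiplied comparison avoids rounding error at exact thresholds.
    return successful_cells * 100 >= threshold_percent * total_cells
```

For instance, 7 successfully targeted cells out of 200 targeted gives a fraction of 0.035, that is, 3.5%, which clears the 1% and 2% levels but not the 5% level.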

In another embodiment, the invention provides a method of modifying a plant cell by creating a plurality of targeted modifications in the genome of the plant cell, comprising: contacting the genome with one or more targeting agents, wherein the one or more agents comprise or encode predetermined peptide or nucleic acid sequences, wherein the predetermined peptide or nucleic acid sequences bind preferentially at or near predetermined target sites within the plant genome, and wherein the binding directs or facilitates the generation of the plurality of targeted modifications within the genome; wherein the plurality of targeted modifications occurs without an intervening step of separately identifying an individual modification and without a step of separately selecting for the occurrence of an individual modification among the plurality of targeted modifications mediated by the targeting agents; and wherein the targeted modifications improve at least one trait of the plant cell, or at least one trait of a plant comprising the plant cell, or at least one trait of a plant grown from the plant cell, or result in a detectable phenotype in the modified plant cell; and wherein at least one of the targeted modifications is an insertion of a predetermined sequence encoded by one or more polynucleotide donor molecules, and wherein at least one of the polynucleotide donor molecules is a single stranded DNA molecule, a single stranded RNA molecule, a single stranded DNA-RNA hybrid molecule, or a duplex RNA-DNA molecule. In a related embodiment, at least one of the polynucleotide donor molecules used in the method lacks homology to the genome sequences adjacent to the site of insertion. In another related embodiment, the modified plant cell is a meristematic cell, embryonic cell, or germline cell.
In yet another related embodiment, repetition of the methods described in this paragraph results in an efficiency of at least 1%, e.g., at least 2%, 5%, 7%, 10%, 15%, 20%, 25%, 30%, 35% or more, wherein said efficiency is determined by dividing the number of successfully targeted cells by the total number of cells targeted. In a related embodiment, the targeted plant cell has a ploidy of 2n, with n being a value selected from the group consisting of 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4, 5, and 6, wherein the method generates 2n targeted modifications at 2n loci of the predetermined target sites within the plant cell genome; and wherein 2n of the targeted modifications are insertions or creations of predetermined sequences encoded by one or more polynucleotide donor molecules.

In another embodiment, the invention provides a method of modifying a plant cell by creating a plurality of targeted modifications in the genome of the plant cell, comprising: contacting the genome with one or more targeting agents, wherein the one or more agents comprise or encode predetermined peptide or nucleic acid sequences, wherein the predetermined peptide or nucleic acid sequences bind preferentially at or near predetermined target sites within the plant genome, and wherein the binding directs the generation of the plurality of targeted modifications within the genome; wherein the plurality of modifications occurs without an intervening step of separately identifying an individual modification and without a step of separately selecting for the occurrence of an individual modification among the plurality of targeted modifications mediated by the targeting agents; wherein the targeted modifications improve at least one trait of the plant cell, or at least one trait of a plant comprising the plant cell, or at least one trait of a plant or seed obtained from the plant cell, or result in a detectable phenotype in the modified plant cell; and wherein the modified plant cell is a meristematic cell, embryonic cell, or germline cell. In a related embodiment, at least one of the targeted modifications is an insertion of a predetermined sequence encoded by one or more polynucleotide donor molecules, and wherein at least one of the polynucleotide donor molecules is a single stranded DNA molecule, a single stranded RNA molecule, a single stranded DNA-RNA hybrid molecule, or a duplex RNA-DNA molecule. In yet another related embodiment, at least one of the polynucleotide donor molecules lacks homology to the genome sequences adjacent to the site of insertion.
In yet another embodiment related to the methods of this paragraph, repetition of the method results in an efficiency of at least 1%, e.g., at least 2%, 5%, 7%, 10%, 15%, 20%, 25%, 30%, 35% or more, wherein said efficiency is determined by dividing the number of successfully targeted cells by the total number of cells targeted.

In another embodiment, the invention provides a method of modifying a soybean plant cell by creating a plurality of targeted modifications in the genome of the plant cell, comprising: contacting the genome with one or more targeting agents, wherein the one or more agents comprise or encode predetermined peptide or nucleic acid sequences, wherein the predetermined peptide or nucleic acid sequences bind preferentially at or near predetermined target sites within the plant genome, and wherein the binding directs the generation of the plurality of targeted modifications within the genome; wherein the plurality of modifications occurs without an intervening step of separately identifying an individual modification and without a step of separately selecting for the occurrence of an individual modification among the plurality of targeted modifications mediated by the targeting agents; and wherein the targeted modifications improve at least one trait of the plant cell, or at least one trait of a plant comprising the plant cell, or at least one trait of a plant or seed obtained from the plant cell, or result in a detectable phenotype in the modified plant cell; and wherein repetition of the aforementioned steps results in an efficiency of at least 1%, e.g., at least 2%, 5%, 7%, 10%, 15%, 20%, 25%, 30%, 35% or more, wherein said efficiency is determined by dividing the number of successfully targeted cells by the total number of cells targeted. In a related embodiment, the modified plant cell is a meristematic cell, embryonic cell, or germline cell. In another related embodiment, at least one of the targeted modifications is an insertion of a predetermined sequence encoded by one or more polynucleotide donor molecules, and wherein at least one of the polynucleotide donor molecules is a single stranded DNA molecule, a single stranded RNA molecule, a single stranded DNA-RNA hybrid molecule, or a duplex RNA-DNA molecule.
In yet another related embodiment of the methods of this paragraph, at least one of the polynucleotide donor molecules used in the method lacks homology to the genome sequences adjacent to the site of insertion.

In various embodiments of the methods described above, at least one of the targeted modifications is an insertion between 3 and 400 nucleotides in length, between 10 and 350 nucleotides in length, between 18 and 350 nucleotides in length, between 18 and 200 nucleotides in length, between 10 and 150 nucleotides in length, or between 11 and 100 nucleotides in length. In certain embodiments, two of the targeted modifications are insertions between 10 and 350 nucleotides in length, between 18 and 350 nucleotides in length, between 18 and 200 nucleotides in length, between 10 and 150 nucleotides in length, or between 11 and 100 nucleotides in length.

In another variation of the methods described above, at least two insertions are made, and at least one of the insertions is an upregulatory sequence. In yet another variation, the targeted modification methods described above insert or create at least one transcription factor binding site. In yet another variation of the methods described above, the insertion or insertions of predetermined sequences into the plant genome are accompanied by the deletion of sequences from the plant genome.

In yet another embodiment of the targeted modification methods described above, the methods further comprise obtaining a plant from the modified plant cell and breeding the plant. In yet another embodiment, the methods described above comprise a step of introducing additional genetic or epigenetic changes into the modified plant cell or into a plant grown from the modified plant cell.

In an embodiment of the targeted modification methods described above, at least two targeted insertions are made and the targeted insertions independently up- or down-regulate the expression of two or more distinct genes. For example, a targeted insertion may increase expression by at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 100% or greater, e.g., by at least 2-fold, 5-fold, 10-fold, 20-fold, 30-fold, 40-fold, 50-fold, 100-fold, or even 1000-fold or more. In some embodiments, expression is increased between 10% and 100%; between 2-fold and 5-fold; between 2-fold and 10-fold; between 10-fold and 50-fold; between 10-fold and 100-fold; between 100-fold and 1000-fold; between 1000-fold and 5,000-fold; or between 5,000-fold and 10,000-fold. In some embodiments, a targeted insertion may decrease expression by at least 1%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99% or more.

In yet another embodiment of the targeted insertion methods described above, the donor polynucleotide is tethered to a crRNA by a covalent bond, a non-covalent bond, or a combination of covalent and non-covalent bonds. In a related embodiment, the invention provides a composition for targeting a genome comprising a donor polynucleotide tethered to a crRNA by a covalent bond, a non-covalent bond, or a combination of covalent and non-covalent bonds.

In another embodiment of the targeted modification methods described above, the loss of epigenetic marks after modifying occurs in less than 0.1%, 0.08%, 0.05%, 0.02%, or 0.01% of the genome. In yet another embodiment of the targeted modification methods described above, the genome of the modified plant cell is more than 99%, e.g., more than 99.5% or more than 99.9% identical to the genome of the parent cell.

In yet another embodiment of the targeted modification methods described above, at least one of the targeted modifications is an insertion and at least one insertion is in a region of the genome that is recalcitrant to meiotic or mitotic recombination.

In certain embodiments of the plant cell genome targeting methods described above, the plant cell is a member of a pool of cells being targeted. In related embodiments, the modified cells within the pool are characterized by sequencing after targeting.

The invention also provides modified soybean plant cells comprising at least two separately targeted insertions in their genome, wherein the insertions are determined relative to a parent plant cell, and wherein the modified plant cell is devoid of mitotically or meiotically generated genetic or epigenetic changes relative to the parent plant cell. In certain embodiments, these plant cells are obtained using the multiplex targeted insertion methods described above. In certain embodiments, the modified plant cells comprise at least two separately targeted insertions, wherein the genome of the modified plant cell is at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, at least 99.9%, or at least 99.99% identical to the parent cell, taking all genetic or epigenetic changes into account.

While the introgression of certain traits and transgenes into plants has been successful, achieving a homozygous modified plant in one step (i.e., modifying all targeted loci simultaneously) has not been previously described. Plants homozygous for, e.g., targeted insertions could only be obtained by further crossing and/or techniques involving double-haploids. These techniques are not only time-consuming and laborious, they also lead to plants which deviate from the original plant not only for the targeted insertion but also for other changes as a consequence of the techniques employed to enable homozygosity. As such changes could have unintended and unpredictable consequences and may require further testing or screening, they are clearly undesired in a breeding process. In certain embodiments, the invention provides methods of making a targeted mutation and/or targeted insertion in all of the 2n targeted loci in a plant genome in one step. In embodiments, the two or more loci are alleles of a given sequence of interest; when all alleles of a given gene or sequence of interest are modified in the same way, the result is homozygous modification of the gene. For example, embodiments of the method enable targeted modification of both alleles of a gene in a diploid (2n ploidy, where n=1) plant, or targeted modification of all three alleles in a triploid (2n ploidy, where n=1.5) plant, or targeted modification of all six alleles of a gene in a hexaploid (2n ploidy, where n=3) plant.
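The relationship between the "n" convention used above and the number of alleles that must be modified for a homozygous result can be sketched as follows. The function names are hypothetical, and representing each allele by a sequence string is an assumption made only for illustration.

```python
def alleles_per_locus(n: float) -> int:
    """Number of alleles at a locus for a plant of ploidy 2n.

    Follows the convention in the text: diploid n=1 (2 alleles),
    triploid n=1.5 (3 alleles), hexaploid n=3 (6 alleles).
    """
    copies = 2 * n
    if copies < 1 or copies != int(copies):
        raise ValueError("2n must be a positive whole number of chromosome sets")
    return int(copies)


def is_homozygous_modification(allele_sequences: list[str], n: float) -> bool:
    """True when every one of the 2n alleles carries the same modified sequence."""
    if len(allele_sequences) != alleles_per_locus(n):
        raise ValueError("expected one sequence per allele")
    return len(set(allele_sequences)) == 1
```

For a hexaploid plant, for instance, the check passes only when all six allele sequences supplied are identical.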

The invention also provides modified plant cells resulting from any of the claimed methods described, as well as recombinant plants grown from those modified plant cells.

In some embodiments, the invention provides a method of manufacturing a processed plant product, comprising: (a) modifying a plant cell according to any of the targeted methods described above; (b) growing a modified plant from said plant cell, and (c) processing the modified plant into a processed product, thereby manufacturing a processed plant product. In related embodiments, the processed product may be meal, oil, juice, sugar, starch, fiber, an extract, wood or wood pulp, flour, cloth or some other commodity plant product. The invention also provides a method of manufacturing a plant product, comprising (a) modifying a plant cell according to any of the targeted methods described above, (b) growing a modified plant from said plant cell, and (c) harvesting a product of the modified plant, thereby manufacturing a plant product. In related embodiments, the plant product may be leaves, fruit, vegetables, nuts, seeds, oil, wood, flowers, cones, branches, hay, fodder, silage, stover, straw, pollen, or some other harvested commodity product. In further related embodiments, the processed products and harvested products are packaged.

DETAILED DESCRIPTION OF THE INVENTION

Definitions

Unless otherwise stated, nucleic acid sequences in the text of this specification are given, when read from left to right, in the 5′ to 3′ direction. Nucleic acid sequences may be provided as DNA or as RNA, as specified; disclosure of one necessarily defines the other, as well as necessarily defines the exact complements, as is known to one of ordinary skill in the art. Where a term is provided in the singular, the inventors also contemplate aspects of the invention described by the plural of that term.
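The statement that disclosure of a DNA sequence defines the corresponding RNA sequence and the exact complement can be illustrated with a short sketch. The helper names are hypothetical, and the sketch assumes sequences containing only the unambiguous bases A, C, G, and T (or U), written 5′ to 3′.

```python
# Translation table for DNA complementation (unambiguous bases only).
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the exact complement of a DNA sequence, written 5' to 3'."""
    return dna.translate(_DNA_COMPLEMENT)[::-1]


def dna_to_rna(dna: str) -> str:
    """Return the RNA sequence corresponding to a DNA sequence (T -> U)."""
    return dna.replace("T", "U").replace("t", "u")


def rna_to_dna(rna: str) -> str:
    """Return the DNA sequence corresponding to an RNA sequence (U -> T)."""
    return rna.replace("U", "T").replace("u", "t")
```

Applying `reverse_complement` twice, or `dna_to_rna` followed by `rna_to_dna`, returns the starting sequence, which reflects the one-to-one correspondence described above.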

By “polynucleotide” is meant a nucleic acid molecule containing multiple nucleotides and refers to “oligonucleotides” (defined here as a polynucleotide molecule of between 2-25 nucleotides in length) and polynucleotides of 26 or more nucleotides. Polynucleotides are generally described as single- or double-stranded. Where a polynucleotide contains double-stranded regions formed by intra- or intermolecular hybridization, the length of each double-stranded region is conveniently described in terms of the number of base pairs. Aspects of this invention include the use of polynucleotides or compositions containing polynucleotides; embodiments include one or more oligonucleotides or polynucleotides or a mixture of both, including single- or double-stranded RNA or single- or double-stranded DNA or single- or double-stranded DNA/RNA hybrids or chemically modified analogues or a mixture thereof. In various embodiments, a polynucleotide (such as a single-stranded DNA/RNA hybrid or a double-stranded DNA/RNA hybrid) includes a combination of ribonucleotides and deoxyribonucleotides (e. g., synthetic polynucleotides consisting mainly of ribonucleotides but with one or more terminal deoxyribonucleotides or synthetic polynucleotides consisting mainly of deoxyribonucleotides but with one or more terminal dideoxyribonucleotides), or includes non-canonical nucleotides such as inosine, thiouridine, or pseudouridine. In embodiments, the polynucleotide includes chemically modified nucleotides (see, e. g., Verma and Eckstein (1998) Annu. Rev. Biochem., 67:99-134); for example, the naturally occurring phosphodiester backbone of an oligonucleotide or polynucleotide can be partially or completely modified with phosphorothioate, phosphorodithioate, or methylphosphonate internucleotide linkage modifications; modified nucleoside bases or modified sugars can be used in oligonucleotide or polynucleotide synthesis; and oligonucleotides or polynucleotides can be labelled with a fluorescent moiety (e. g., fluorescein or rhodamine or a fluorescence resonance energy transfer or FRET pair of chromophore labels) or other label (e. g., biotin or an isotope). Modified nucleic acids, particularly modified RNAs, are disclosed in U.S. Pat. No. 9,464,124, incorporated by reference in its entirety herein. For some polynucleotides (especially relatively short polynucleotides, e. g., oligonucleotides of 2-25 nucleotides or base-pairs, or polynucleotides of about 25 to about 300 nucleotides or base-pairs), use of modified nucleic acids, such as locked nucleic acids (“LNAs”), is useful to modify physical characteristics such as increased melting temperature (T_m) of a polynucleotide duplex incorporating DNA or RNA molecules that contain one or more LNAs; see, e. g., You et al. (2006) Nucleic Acids Res., 34:1-11 (e60), doi:10.1093/nar/gkl175.

In the context of the genome targeting methods described herein, the phrase “contacting a genome” with an agent means that an agent responsible for effecting the targeted genome modification (e.g., a break, a deletion, a rearrangement, or an insertion) is delivered to the interior of the cell so the directed mutagenic action can take place.

In the context of discussing or describing the ploidy of a plant cell, the “n” (as in “a ploidy of 2n”) refers to the number of homologous pairs of chromosomes, and is typically equal to the number of homologous pairs of gene loci on all chromosomes present in the cell.

The term “inbred variety” refers to a genetically homozygous or substantially homozygous population of plants that preferably comprises homozygous alleles at about 95%, preferably 98.5% or more of its loci. An inbred line can be developed through inbreeding (i.e., several cycles of selfing, more preferably at least 5, 6, 7 or more cycles of selfing) or doubled haploidy resulting in a plant line with a high uniformity. Inbred lines breed true, e.g., for one or more or all phenotypic traits of interest. An “inbred”, “inbred individual”, or “inbred progeny” is an individual sampled from an inbred line.

“F1, F2, F3, etc.” refers to the consecutive related generations following a cross between two parent plants or parent lines. The plants grown from the seeds produced by crossing two plants or lines are called the F1 generation. Selfing the F1 plants results in the F2 generation, etc. An “F1 hybrid” plant (or F1 hybrid seed) is the generation obtained from crossing two inbred parent lines. Thus, F1 hybrid seeds are seeds from which F1 hybrid plants grow. F1 hybrids are more vigorous and higher yielding, due to heterosis.

Hybrid seed: Hybrid seed is seed produced by crossing two different inbred lines (i.e., a female inbred line with a male inbred line). Hybrid seed is heterozygous over a majority of its alleles.

As used herein, the term “variety” refers to a group of similar plants that by structural or genetic features and/or performance can be distinguished from other varieties within the same species.

The term “cultivar” (for cultivated variety) is used herein to denote a variety that is not normally found in nature but that has been created by humans, i.e., having a biological status other than a “wild” status, which “wild” status indicates the original non-cultivated, or natural state of a plant or accession. The term “cultivar” includes, but is not limited to, semi-natural, semi-wild, weedy, traditional cultivar, landrace, breeding material, research material, breeder's line, synthetic population, hybrid, founder stock/base population, inbred line (parent of hybrid cultivar), segregating population, mutant/genetic stock, and advanced/improved cultivar. The term “elite background” is used herein to indicate the genetic context or environment of a targeted mutation or insertion.

The term “dihaploid line” refers to stable inbred lines issued from anther culture. Some pollen grains (haploid) cultivated on specific medium and circumstances can develop plantlets containing n chromosomes. These plantlets are then “doubled” and contain 2n chromosomes. The progeny of these plantlets are named “dihaploid” and are essentially not segregating any more (i.e., they are stable).

Inbred lines are essentially homozygous at most loci in the genome. A “plant line” or “breeding line” refers to a plant and its progeny.

The term “allele(s)” means any of one or more alternative forms of a gene at a particular locus, all of which alleles relate to one trait or characteristic at a specific locus. In a diploid cell of an organism, alleles of a given gene are located at a specific location, or locus (loci plural), on a chromosome. One allele is present on each chromosome of the pair of homologous chromosomes. A diploid plant species may comprise a large number of different alleles at a particular locus. These may be identical alleles of the gene (homozygous) or two different alleles (heterozygous).

The term “locus” (loci plural) means a specific place or places or a site on a chromosome where for example a QTL, a gene or genetic marker is found.

The spontaneous (non-targeted) mutation rate for a single base pair is estimated to be 7×10⁻⁹ per bp per generation. Assuming an estimated 30 replications per generation, this leads to an estimated spontaneous (non-targeted) mutation rate of 2×10⁻¹⁰ mutations per base pair per replication event.
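The per-replication estimate follows from dividing the per-generation rate by the assumed number of replications per generation. A minimal sketch of that arithmetic, using only the two figures stated above (the variable names are illustrative):

```python
# Figures taken from the text above.
MUTATION_RATE_PER_BP_PER_GENERATION = 7e-9
REPLICATIONS_PER_GENERATION = 30

# Per-replication rate: per-generation rate spread over the replication events.
rate_per_bp_per_replication = (
    MUTATION_RATE_PER_BP_PER_GENERATION / REPLICATIONS_PER_GENERATION
)

# Rounded to one significant figure, as quoted in the text.
rounded_rate = f"{rate_per_bp_per_replication:.0e}"
```

The unrounded quotient is approximately 2.3×10⁻¹⁰, which rounds to the 2×10⁻¹⁰ figure given in the text.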

Tools and Methods for Multiplex Editing

CRISPR technology for editing the genes of eukaryotes is disclosed in U.S. Patent Application Publications 2016/0138008A1 and US2015/0344912A1, and in U.S. Pat. Nos. 8,697,359, 8,771,945, 8,945,839, 8,999,641, 8,993,233, 8,895,308, 8,865,406, 8,889,418, 8,871,445, 8,889,356, 8,932,814, 8,795,965, and 8,906,616. Cpf1 endonuclease and corresponding guide RNAs and PAM sites are disclosed in U.S. Patent Application Publication 2016/0208243 A1. Other CRISPR nucleases useful for editing genomes include C2c1 and C2c3 (see Shmakov et al. (2015) Mol. Cell, 60:385-397) and CasX and CasY (see Burstein et al. (2016) Nature, doi:10.1038/nature21059). Plant RNA promoters for expressing CRISPR guide RNA and plant codon-optimized CRISPR Cas9 endonuclease are disclosed in International Patent Application PCT/US2015/018104 (published as WO 2015/131101 and claiming priority to U.S. Provisional Patent Application 61/945,700). Methods of using CRISPR technology for genome editing in plants are disclosed in U.S. Patent Application Publications US 2015/0082478A1 and US 2015/0059010A1 and in International Patent Application PCT/US2015/038767 A1 (published as WO 2016/007347 and claiming priority to U.S. Provisional Patent Application 62/023,246).

CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas (CRISPR-associated) systems, or CRISPR systems, are adaptive defense systems originally discovered in bacteria and archaea. CRISPR systems use RNA-guided nucleases termed CRISPR-associated or “Cas” endonucleases (e. g., Cas9 or Cpf1) to cleave foreign DNA. In a typical CRISPR/Cas system, a Cas endonuclease is directed to a target nucleotide sequence (e. g., a site in the genome that is to be sequence-edited) by sequence-specific, non-coding “guide RNAs” that target single- or double-stranded DNA sequences. In microbial hosts, CRISPR loci encode both Cas endonucleases and “CRISPR arrays” of the non-coding RNA elements that determine the specificity of the CRISPR-mediated nucleic acid cleavage.

Three classes (I-III) of CRISPR systems have been identified across a wide range of bacterial hosts. The well characterized class II CRISPR systems use a single Cas endonuclease (rather than multiple Cas proteins). One class II CRISPR system includes a type II Cas endonuclease such as Cas9, a CRISPR RNA (“crRNA”), and a trans-activating crRNA (“tracrRNA”). The crRNA contains a “guide RNA”, typically a 20-nucleotide RNA sequence that corresponds to (i. e., is identical or nearly identical to, or alternatively is complementary or nearly complementary to) a 20-nucleotide target DNA sequence. The crRNA also contains a region that binds to the tracrRNA to form a partially double-stranded structure which is cleaved by RNase III, resulting in a crRNA/tracrRNA hybrid. The crRNA/tracrRNA hybrid then directs the Cas9 endonuclease to recognize and cleave the target DNA sequence.

The target DNA sequence must generally be adjacent to a “protospacer adjacent motif” (“PAM”) that is specific for a given Cas endonuclease; however, PAM sequences are short and relatively non-specific, appearing throughout a given genome. CRISPR endonucleases identified from various prokaryotic species have unique PAM sequence requirements; examples of PAM sequences include 5′-NGG (Streptococcus pyogenes), 5′-NNAGAA (Streptococcus thermophilus CRISPR1), 5′-NGGNG (Streptococcus thermophilus CRISPR3), 5′-NNGRRT or 5′-NNGRR (Staphylococcus aureus Cas9, SaCas9), and 5′-NNNGATT (Neisseria meningitidis). Some endonucleases, e. g., Cas9 endonucleases, are associated with G-rich PAM sites, e. g., 5′-NGG, and perform blunt-end cleaving of the target DNA at a location 3 nucleotides upstream from (5′ from) the PAM site.
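As an illustration of the PAM requirement and cut position described above, the following minimal sketch scans one strand of a DNA sequence for 5′-NGG PAM sites and reports, for each, the adjacent protospacer and the position of the blunt cut 3 nucleotides 5′ of the PAM. The function name, the 20-nucleotide default protospacer length, the restriction to the strand as written, and the example sequence are assumptions made for illustration only.

```python
import re


def find_ngg_sites(seq: str, protospacer_len: int = 20) -> list[tuple[str, str, int]]:
    """Find 5'-NGG PAM sites on the given strand (written 5' to 3').

    Returns (protospacer, pam, cut_index) tuples, where cut_index is the
    0-based index of the first base 3' of the blunt cut, i.e. the cut lies
    3 nucleotides upstream (5') of the PAM.
    """
    seq = seq.upper()
    sites = []
    # A lookahead is used so that overlapping PAM candidates are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start < protospacer_len:
            continue  # not enough sequence 5' of the PAM for a full protospacer
        protospacer = seq[pam_start - protospacer_len:pam_start]
        sites.append((protospacer, match.group(1), pam_start - 3))
    return sites


# Hypothetical 26-nt example: a 20-nt protospacer followed by an AGG PAM.
example_sites = find_ngg_sites("GATTACAGATTACAGATTACAGGTTT")
```

A fuller search would also scan the reverse complement of the sequence, since a target site can lie on either strand.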

Another class II CRISPR system includes the type V endonuclease Cpf1, which is a smaller endonuclease than is Cas9; examples include AsCpf1 (from Acidaminococcus sp.) and LbCpf1 (from Lachnospiraceae sp.). Cpf1-associated CRISPR arrays are processed into mature crRNAs without the requirement of a tracrRNA; in other words, a Cpf1 system requires only the Cpf1 nuclease and a crRNA to cleave the target DNA sequence. Cpf1 endonucleases are associated with T-rich PAM sites, e. g., 5′-TTN. Cpf1 can also recognize a 5′-CTA PAM motif. Cpf1 cleaves the target DNA by introducing an offset or staggered double-strand break with a 4- or 5-nucleotide 5′ overhang, for example, cleaving a target DNA with a 5-nucleotide offset or staggered cut located 18 nucleotides downstream from (3′ from) the PAM site on the coding strand and 23 nucleotides downstream from the PAM site on the complementary strand; the 5-nucleotide overhang that results from such offset cleavage allows more precise genome editing by DNA insertion by homologous recombination than by insertion at blunt-end cleaved DNA. See, e. g., Zetsche et al. (2015) Cell, 163:759-771. Other CRISPR nucleases useful in methods and compositions of the invention include C2c1 and C2c3 (see Shmakov et al. (2015) Mol. Cell, 60:385-397). Like other CRISPR nucleases, C2c1 from Alicyclobacillus acidoterrestris (AacC2c1; amino acid sequence with accession ID TOD7A2, deposited on-line at www[dot]ncbi[dot]nlm[dot]nih[dot]gov/protein/1076761101) requires a guide RNA and PAM recognition site; C2c1 cleavage results in a staggered seven-nucleotide DSB in the target DNA (see Yang et al. (2016) Cell, 167:1814-1828.e12) and is reported to have high mismatch sensitivity, thus reducing off-target effects (see Liu et al. (2016) Mol. Cell, available on line at dx[dot]doi[dot]org/10[dot]1016/j[dot]molcel[dot]2016[dot]11.040). Yet other CRISPR nucleases include nucleases identified from the genomes of uncultivated microbes, such as CasX and CasY (e. g., a CRISPR-associated protein CasY from an uncultured Parcubacteria group bacterium, amino acid sequence with accession ID APG80656, deposited on-line at www[dot]ncbi[dot]nlm[dot]nih[dot]gov/protein/APG80656.1); see Burstein et al. (2016) Nature, doi:10.1038/nature21059.
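The staggered Cpf1 cut geometry described above (18 nucleotides downstream of the PAM on the coding strand and 23 nucleotides downstream on the complementary strand) can be sketched numerically. The function names and the example sequence are hypothetical; `pam_end` is assumed to be the 0-based index of the first base 3′ of the PAM on the coding strand.

```python
def cpf1_cut_positions(pam_end: int, coding_offset: int = 18,
                       complementary_offset: int = 23) -> tuple[int, int, int]:
    """Return (coding_cut, complementary_cut, overhang_length).

    Both cut positions are expressed as 0-based coding-strand indices of
    the first base 3' of each cut; the offsets default to the 18- and
    23-nucleotide figures given in the text.
    """
    coding_cut = pam_end + coding_offset
    complementary_cut = pam_end + complementary_offset
    return coding_cut, complementary_cut, complementary_cut - coding_cut


def staggered_region(seq: str, pam_end: int) -> str:
    """Bases lying between the two cuts, read on the coding strand."""
    coding_cut, complementary_cut, _ = cpf1_cut_positions(pam_end)
    if complementary_cut > len(seq):
        raise ValueError("sequence too short to contain both cut sites")
    return seq[coding_cut:complementary_cut]


# Hypothetical example: a 3-nt TTA PAM followed by 30 nt of target sequence.
example_region = staggered_region("TTA" + "ACGT" * 7 + "AC", pam_end=3)
```

With the default offsets, the region between the two cuts is 5 nucleotides long, matching the 5-nucleotide overhang described in the text.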

For the purposes of gene editing, CRISPR arrays can be designed to contain one or multiple guide RNA sequences corresponding to a desired target DNA sequence; see, for example, Cong et al. (2013) Science, 339:819-823; Ran et al. (2013) Nature Protocols, 8:2281-2308. At least 16 or 17 nucleotides of gRNA sequence are required by Cas9 for DNA cleavage to occur; for Cpf1 at least 16 nucleotides of gRNA sequence are needed to achieve detectable DNA cleavage and at least 18 nucleotides of gRNA sequence were reported necessary for efficient DNA cleavage in vitro; see Zetsche et al. (2015) Cell, 163:759-771. In practice, guide RNA sequences are generally designed to have a length of between 17-24 nucleotides (frequently 19, 20, or 21 nucleotides) and exact complementarity (i. e., perfect base-pairing) to the targeted gene or nucleic acid sequence; guide RNAs having less than 100% complementarity to the target sequence can be used (e. g., a gRNA with a length of 20 nucleotides and between 1-4 mismatches to the target sequence) but can increase the potential for off-target effects. The design of effective guide RNAs for use in plant genome editing is disclosed in U.S. Patent Application Publication 2015/0082478 A1, the entire specification of which is incorporated herein by reference. More recently, efficient gene editing has been achieved using a chimeric “single guide RNA” (“sgRNA”), an engineered (synthetic) single RNA molecule that mimics a naturally occurring crRNA-tracrRNA complex and contains both a tracrRNA (for binding the nuclease) and at least one crRNA (to guide the nuclease to the sequence targeted for editing); see, for example, Cong et al. (2013) Science, 339:819-823; Xing et al. (2014) BMC Plant Biol., 14:327-340. Chemically modified sgRNAs have been demonstrated to be effective in genome editing; see, for example, Hendel et al. (2015) Nature Biotech., 985-991.
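The guide length range and mismatch counts discussed above lend themselves to a simple check. The following is a minimal sketch, not a guide design tool: the function names are hypothetical, and it assumes the guide and the target are written in the same 5′ to 3′ sense (with U treated as T) so that a perfectly matched guide shows zero mismatches.

```python
def count_mismatches(guide: str, target: str) -> int:
    """Count positions at which a guide differs from an equal-length target."""
    guide = guide.upper().replace("U", "T")
    target = target.upper().replace("U", "T")
    if len(guide) != len(target):
        raise ValueError("guide and target must have the same length")
    return sum(g != t for g, t in zip(guide, target))


def summarize_guide(guide: str, target: str) -> dict:
    """Report guide length suitability and mismatches against a target.

    The 17-24 nucleotide range is the design range given in the text.
    """
    mismatches = count_mismatches(guide, target)
    return {
        "length": len(guide),
        "length_in_design_range": 17 <= len(guide) <= 24,
        "mismatches": mismatches,
        "exact_match": mismatches == 0,
    }
```

For example, a 20-nucleotide guide differing from its intended target at two positions would be reported as within the design range but not an exact match; per the text, such guides can be used but can increase the potential for off-target effects.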

CRISPR-type genome editing has value in various aspects of agriculture research and development. CRISPR elements, i.e., CRISPR endonucleases and CRISPR single-guide RNAs, are useful in effecting genome editing without remnants of the CRISPR elements or selective genetic markers occurring in progeny. Alternatively, genome-inserted CRISPR elements are useful in plant lines adapted for multiplex genetic screening and breeding. For instance, a plant species can be created to express one or more of a CRISPR endonuclease such as a Cas9- or a Cpf1-type endonuclease or combinations with unique PAM recognition sites. Cpf1 endonuclease and corresponding guide RNAs and PAM sites are disclosed in U.S. Patent Application Publication 2016/0208243 A1, which is incorporated herein by reference for its disclosure of DNA encoding Cpf1 endonucleases and guide RNAs and PAM sites. Introduction of one or more of a wide variety of CRISPR guide RNAs that interact with CRISPR endonucleases integrated into a plant genome or otherwise provided to a plant is useful for genetic editing for providing desired phenotypes or traits, for trait screening, or for trait introgression. Multiple endonucleases can be provided in expression cassettes with the appropriate promoters to allow multiple genome editing in a spatially or temporally separated fashion in either chromosomal DNA or episomal DNA.

Whereas wild-type Cas9 generates double-strand breaks (DSBs) at specific DNA sequences targeted by a gRNA, a number of CRISPR endonucleases having modified functionalities are available, for example: (1) a “nickase” version of Cas9 generates only a single-strand break; (2) a catalytically inactive Cas9 (“dCas9”) does not cut the target DNA but interferes with transcription; (3) dCas9 on its own or fused to a repressor peptide can repress gene expression; (4) dCas9 fused to an activator peptide can activate or increase gene expression; (5) dCas9 fused to FokI nuclease (“dCas9-FokI”) can be used to generate DSBs at target sequences homologous to two gRNAs; and (6) dCas9 fused to histone-modifying enzymes (e. g., histone acetyltransferases, histone methyltransferases, histone deacetylases, and histone demethylases) can be used to alter the epigenome in a site-specific manner, for example, by changing the methylation or acetylation status at a particular locus. See, e. g., the numerous CRISPR/Cas9 plasmids disclosed in and publicly available from the Addgene repository (Addgene, 75 Sidney St., Suite 550A, Cambridge, Mass. 02139; addgene[dot]org/crispr/). A “double nickase” Cas9 approach that introduces two separate single-strand breaks (nicks), each directed by a separate guide RNA, is described as achieving more accurate genome editing by Ran et al. (2013) Cell, 154:1380-1389.

In some embodiments, the methods of targeted modification described herein provide a means for avoiding unwanted epigenetic losses that can arise from tissue culturing modified plant cells (see, e.g., Stroud et al. eLife 2013; 2:e00354). Using the methods described herein in the absence of tissue culture, a loss of epigenetic marking may occur in less than 0.01% of the genome. This contrasts with results obtained with plants where tissue culture methods may result in losses of DNA methylation that occur, on average, as determined by bisulfite sequencing, at 1344 places that are on average 334 base pairs long, which means a loss of DNA methylation at an average of 0.1% of the genome (Stroud, 2013). In other words, the loss in marks using the targeted modification techniques described herein without tissue culture is 10 times lower than the loss observed when tissue culture techniques are relied on. In certain embodiments of the novel modified plant cells described herein, the modified plant cell or plant does not have significant losses of methylation compared to a non-modified parent plant cell or plant; in other words, the methylation pattern of the genome of the modified plant cell or plant is not greatly different from the methylation pattern of the genome of the parent plant cell or plant; in embodiments, the difference between the methylation pattern of the genome of the modified plant cell or plant and that of the parent plant cell or plant is less than 0.1%, 0.05%, 0.02%, or 0.01% of the genome, or less than 0.005% of the genome, or less than 0.001% of the genome (see, e. g., Stroud et al. (2013) eLife 2:e00354; doi:10.7554/eLife.00354).
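The figures quoted above can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text; the genome size it reports is merely the size implied by those numbers, not a value given in the text, and the variable names are illustrative.

```python
# Figures taken from the text above (tissue-culture case).
AFFECTED_REGIONS = 1344            # places where DNA methylation was lost
MEAN_REGION_LENGTH_BP = 334        # average length of each region, in bp
TISSUE_CULTURE_LOSS_FRACTION = 0.001   # 0.1% of the genome
TARGETED_LOSS_FRACTION = 0.0001        # less than 0.01% of the genome

# Total number of base pairs affected by methylation loss.
affected_bp = AFFECTED_REGIONS * MEAN_REGION_LENGTH_BP

# Genome size implied if those base pairs make up 0.1% of the genome.
implied_genome_size_bp = affected_bp / TISSUE_CULTURE_LOSS_FRACTION

# Fold difference between the tissue-culture loss and the 0.01% upper bound.
fold_difference = TISSUE_CULTURE_LOSS_FRACTION / TARGETED_LOSS_FRACTION
```

The affected regions total 448,896 bp, which corresponds to 0.1% of a genome of roughly 450 Mb, and the ratio of 0.1% to 0.01% gives the 10-fold difference stated above.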

CRISPR technology for editing the genes of eukaryotes is disclosed in U.S. Patent Application Publications 2016/0138008A1 and US2015/0344912A1, and in U.S. Pat. Nos. 8,697,359, 8,771,945, 8,945,839, 8,999,641, 8,993,233, 8,895,308, 8,865,406, 8,889,418, 8,871,445, 8,889,356, 8,932,814, 8,795,965, and 8,906,616. Cpf1 endonuclease and corresponding guide RNAs and PAM sites are disclosed in U.S. Patent Application Publication 2016/0208243 A1. Plant RNA promoters for expressing CRISPR guide RNA and plant codon-optimized CRISPR Cas9 endonuclease are disclosed in International Patent Application PCT/US2015/018104 (published as WO 2015/131101 and claiming priority to U.S. Provisional Patent Application 61/945,700). Methods of using CRISPR technology for genome editing in plants are disclosed in U.S. Patent Application Publications US 2015/0082478A1 and US 2015/0059010A1 and in International Patent Application PCT/US2015/038767 A1 (published as WO 2016/007347 and claiming priority to U.S. Provisional Patent Application 62/023,246). All of the patent publications referenced in this paragraph are incorporated herein by reference in their entirety.

In some embodiments, one or more vectors driving expression of one or more polynucleotides encoding elements of a genome-editing system (e. g., encoding a guide RNA or a nuclease) are introduced into a plant cell or a plant protoplast, whereby these elements, when expressed, result in alteration of a target nucleotide sequence. In embodiments, a vector comprises a regulatory element such as a promoter operably linked to one or more polynucleotides encoding elements of a genome-editing system. In such embodiments, expression of these polynucleotides can be controlled by selection of the appropriate promoter, particularly promoters functional in a plant cell; useful promoters include constitutive, conditional, inducible, and temporally or spatially specific promoters (e. g., a tissue specific promoter, a developmentally regulated promoter, or a cell cycle regulated promoter). In embodiments, the promoter is operably linked to nucleotide sequences encoding multiple guide RNAs, wherein the sequences encoding guide RNAs are separated by a cleavage site such as a nucleotide sequence encoding a microRNA recognition/cleavage site or a self-cleaving ribozyme (see, e. g., Ferré-D'Amaré and Scott (2014) Cold Spring Harbor Perspectives Biol., 2:a003574). In embodiments, the promoter is a pol II promoter operably linked to a nucleotide sequence encoding one or more guide RNAs. In embodiments, the promoter operably linked to one or more polynucleotides encoding elements of a genome-editing system is a constitutive promoter that drives DNA expression in plant cells; in embodiments, the promoter drives DNA expression in the nucleus or in an organelle such as a chloroplast or mitochondrion. Examples of constitutive promoters include a CaMV 35S promoter as disclosed in U.S. Pat. Nos. 5,858,742 and 5,322,938, a rice actin promoter as disclosed in U.S. Pat. No. 5,641,876, a maize chloroplast aldolase promoter as disclosed in U.S. Pat. No. 7,151,204, and a nopaline synthase (NOS) and octopine synthase (OCS) promoter from Agrobacterium tumefaciens. In embodiments, the promoter operably linked to one or more polynucleotides encoding elements of a genome-editing system is a promoter from figwort mosaic virus (FMV), a RUBISCO promoter, or a pyruvate phosphate dikinase (PDK) promoter, which is active in the chloroplasts of mesophyll cells. Other contemplated promoters include cell-specific or tissue-specific or developmentally regulated promoters, for example, a promoter that limits the expression of the nucleic acid targeting system to germline or reproductive cells (e. g., promoters of genes encoding DNA ligases, recombinases, replicases, or other genes specifically expressed in germline or reproductive cells); in such embodiments, the nuclease-mediated genetic modification (e. g., chromosomal or episomal double-stranded DNA cleavage) is limited only to those cells from which DNA is inherited in subsequent generations, which is advantageous where it is desirable that expression of the genome-editing system be limited in order to avoid genotoxicity or other unwanted effects. All of the patent publications referenced in this paragraph are incorporated herein by reference in their entirety.

In some embodiments, elements of a genome-editing system (e.g., an RNA-guided nuclease and a guide RNA) are operably linked to separate regulatory elements on separate vectors. In other embodiments, two or more elements of a genome-editing system expressed from the same or different regulatory elements or promoters are combined in a single vector, optionally with one or more additional vectors providing any additional necessary elements of a genome-editing system not included in the first vector. For example, multiple guide RNAs can be expressed from one vector, with the appropriate RNA-guided nuclease expressed from a second vector. In another example, one or more vectors for the expression of one or more guide RNAs (e. g., crRNAs or sgRNAs) are delivered to a cell (e. g., a plant cell or a plant protoplast) that expresses the appropriate RNA-guided nuclease, or to a cell that otherwise contains the nuclease, such as by way of prior administration thereto of a vector for in vivo expression of the nuclease.

Genome-editing system elements that are combined in a single vector may be arranged in any suitable orientation, such as one element located 5′ with respect to (“upstream” of) or 3′ with respect to (“downstream” of) a second element. The coding sequence of one element may be located on the same or opposite strand of the coding sequence of a second element, and oriented in the same or opposite direction. In embodiments, the endonuclease and the nucleic acid-targeting guide RNA may be operably linked to and expressed from the same promoter. In embodiments, a single promoter drives expression of a transcript encoding an endonuclease and the guide RNA, embedded within one or more intron sequences (e. g., each in a different intron, two or more in at least one intron, or all in a single intron), which can be plant-derived; such use of introns is especially contemplated when the expression vector is being transformed or transfected into a monocot plant cell or a monocot plant protoplast.

Expression vectors provided herein may contain a DNA segment near the 3′ end of an expression cassette that acts as a signal to terminate transcription and directs polyadenylation of the resultant mRNA. Such a 3′ element is commonly referred to as a “3′-untranslated region” or “3′-UTR” or a “polyadenylation signal”. Useful 3′ elements include: Agrobacterium tumefaciens nos 3′, tml 3′, tmr 3′, tms 3′, ocs 3′, and tr7 3′ elements disclosed in U.S. Pat. No. 6,090,627, incorporated herein by reference, and 3′ elements from plant genes such as the heat shock protein 17, ubiquitin, and fructose-1,6-biphosphatase genes from wheat (Triticum aestivum), and the glutelin, lactate dehydrogenase, and beta-tubulin genes from rice (Oryza sativa), disclosed in U. S. Patent Application Publication 2002/0192813 A1, incorporated herein by reference.

In certain embodiments, a vector or an expression cassette includes additional components, e. g., a polynucleotide encoding a drug resistance or herbicide resistance gene or a polynucleotide encoding a detectable marker such as green fluorescent protein (GFP) or beta-glucuronidase (gus) to allow convenient screening or selection of cells expressing the vector. In embodiments, the vector or expression cassette includes additional elements for improving delivery to a plant cell or plant protoplast or for directing or modifying expression of one or more genome-editing system elements, for example, fusing a sequence encoding a cell-penetrating peptide, localization signal, transit, or targeting peptide to the RNA-guided nuclease, or adding a nucleotide sequence to stabilize a guide RNA; such fusion proteins (and the polynucleotides encoding such fusion proteins) or combination polypeptides, as well as expression cassettes and vectors for their expression in a cell, are specifically claimed. In embodiments, an RNA-guided nuclease (e. g., Cas9, Cpf1, CasY, CasX, C2c1, or C2c3) is fused to a localization signal, transit, or targeting peptide, e. g., a nuclear localization signal (NLS), a chloroplast transit peptide (CTP), or a mitochondrial targeting peptide (MTP); in a vector or an expression cassette, the nucleotide sequence encoding any of these can be located 5′ and/or 3′ to the DNA encoding the nuclease. For example, a plant-codon-optimized Cas9 (pco-Cas9) from Streptococcus pyogenes and S. thermophilus containing nuclear localization signals and codon-optimized for expression in maize is disclosed in PCT/US2015/018104 (published as WO/2015/131101 and claiming priority to U.S. Provisional Patent Application 61/945,700), incorporated herein by reference. In another example, a chloroplast-targeting RNA is appended to the 5′ end of an mRNA encoding an endonuclease to drive the accumulation of the mRNA in chloroplasts; see Gomez, et al. (2010) Plant Signal Behav., 5:1517-1519. 
In an embodiment, a Cas9 from Streptococcus pyogenes is fused to a nuclear localization signal (NLS), such as the NLS from SV40. In an embodiment, a Cas9 from Streptococcus pyogenes is fused to a cell-penetrating peptide (CPP), such as octa-arginine or nona-arginine or a homoarginine 12-mer oligopeptide, or a CPP disclosed in the database of cell-penetrating peptides CPPsite 2.0, publicly available at crdd[dot]osdd[dot]net/raghava/cppsite/. In an example, a Cas9 from Streptococcus pyogenes (which normally carries a net positive charge) is modified at the N-terminus with a negatively charged glutamate peptide “tag” and at the C-terminus with a nuclear localization signal (NLS); when mixed with cationic arginine gold nanoparticles (ArgNPs), self-assembled nanoassemblies were formed which were shown to provide good editing efficiency in human cells; see Mout et al. (2017) ACS Nano, doi:10.1021/acsnano.6b07600. In an embodiment, a Cas9 from Streptococcus pyogenes is fused to a chloroplast transit peptide (CTP) sequence. In embodiments, a CTP sequence is obtained from any nuclear gene that encodes a protein that targets a chloroplast, and the isolated or synthesized CTP DNA is appended to the 5′ end of the DNA that encodes a nuclease targeted for use in a chloroplast. Chloroplast transit peptides and their use are described in U.S. Pat. Nos. 5,188,642, 5,728,925, and 8,420,888, all of which are incorporated herein by reference in their entirety. Specifically, the CTP nucleotide sequences provided with the sequence identifier (SEQ ID) numbers 12-15 and 17-22 of U.S. Pat. No. 8,420,888 are incorporated herein by reference. In an embodiment, a Cas9 from Streptococcus pyogenes is fused to a mitochondrial targeting peptide (MTP), such as a plant MTP sequence; see, e. g., Jores et al. (2016) Nature Communications, 7:12036-12051.

Plasmids designed for use in plants and encoding CRISPR genome editing elements (CRISPR nucleases and guide RNAs) are publicly available from plasmid repositories such as Addgene (Cambridge, Mass.; also see “addgene[dot]com”) or can be designed using publicly disclosed sequences, e. g., sequences of CRISPR nucleases. In embodiments, such plasmids are used to co-express both CRISPR nuclease mRNA and guide RNA(s); in other embodiments, CRISPR endonuclease mRNA and guide RNA are encoded on separate plasmids. In embodiments, the plasmids are Agrobacterium Ti plasmids. Materials and methods for preparing expression cassettes and vectors for CRISPR endonuclease and guide RNA for stably integrated and/or transient plant transformation are disclosed in PCT/US2015/018104 (published as WO/2015/131101 and claiming priority to U.S. Provisional Patent Application 61/945,700), U. S. Patent Application Publication 2015/0082478 A1, and PCT/US2015/038767 (published as WO/2016/007347 and claiming priority to U.S. Provisional Patent Application 62/023,246), all of which are incorporated herein by reference in their entirety. In embodiments, such expression cassettes are isolated linear fragments, or are part of a larger construct that includes bacterial replication elements and selectable markers; such embodiments are useful, e. g., for particle bombardment or nanoparticle delivery or protoplast transformation. In embodiments, the expression cassette is adjacent to or located between T-DNA borders or contained within a binary vector, e. g., for Agrobacterium-mediated transformation. In embodiments, a plasmid encoding a CRISPR nuclease is delivered to a cell (such as a plant cell or a plant protoplast) for stable integration of the CRISPR nuclease into the genome of the cell, or alternatively for transient expression of the CRISPR nuclease. 
In embodiments, plasmids encoding a CRISPR nuclease are delivered to a plant cell or a plant protoplast to achieve stable or transient expression of the CRISPR nuclease, and one or multiple guide RNAs (such as a library of individual guide RNAs or multiple pooled guide RNAs) or plasmids encoding the guide RNAs are delivered to the plant cell or plant protoplast individually or in combinations, thus providing libraries or arrays of plant cells or plant protoplasts (or of plant callus or whole plants derived therefrom), in which a variety of genome edits are provided by the different guide RNAs. A pool or arrayed collection of diverse modified plant cells comprising subsets of targeted modifications (e.g., a collection of plant cells or plants where some plants are homozygous and some are heterozygous for one, two, three or more targeted modifications) can be compared to determine the function of modified sequences (e.g., mutated or deleted sequences or genes) or the function of sequences being inserted. In other words, the methods and tools described herein can be used to perform “reverse genetics.”
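
By way of non-limiting illustration only, the layout of an arrayed library in which guide RNAs are delivered individually or in combinations can be sketched in Python as follows. The function name, the guide RNA labels, and the cap on the number of guide RNAs per well are hypothetical and are not taken from this disclosure; the sketch only enumerates which guide RNAs each well of an array would receive.

```python
from itertools import combinations


def arrayed_guide_library(guide_names, max_guides_per_well=2):
    """Return one tuple of guide RNA labels per well.

    Every combination of 1..max_guides_per_well guides is listed once,
    so single-guide wells come first, then pairs, and so on.
    """
    if max_guides_per_well < 1:
        raise ValueError("max_guides_per_well must be at least 1")
    largest = min(max_guides_per_well, len(guide_names))
    wells = []
    for size in range(1, largest + 1):
        wells.extend(combinations(guide_names, size))
    return wells


# Hypothetical labels; three guides, at most two per well.
wells = arrayed_guide_library(["gRNA_A", "gRNA_B", "gRNA_C"], max_guides_per_well=2)
for index, guides in enumerate(wells, start=1):
    print(f"well {index}: {', '.join(guides)}")
```

A pooled (rather than arrayed) library would instead mix all guide RNAs in one delivery, in which case no such per-well layout is needed.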

In certain embodiments where the genome-editing system is a CRISPR system, expression of the guide RNA is driven by a plant U6 spliceosomal RNA promoter, which can be native to the genome of the plant cell or from a different species, e. g., a U6 promoter from maize, tomato, or soybean such as those disclosed in PCT/US2015/018104 (published as WO 2015/131101 and claiming priority to U.S. Provisional Patent Application 61/945,700), incorporated herein by reference, or a homologue thereof; such a promoter is operably linked to DNA encoding the guide RNA for directing an endonuclease, followed by a suitable 3′ element such as a U6 poly-T terminator. In another embodiment, an expression cassette for expressing guide RNAs in plants is used, wherein the promoter is a plant U3, 7SL (signal recognition particle RNA), U2, or U5 promoter, or chimerics thereof, e. g., as described in PCT/US2015/018104 (published as WO 2015/131101 and claiming priority to U.S. Provisional Patent Application 61/945,700), incorporated herein by reference. When multiple or different guide RNA sequences are used, a single expression construct may be used to correspondingly direct the genome editing activity to the multiple or different target sequences in a cell, such as a plant cell or a plant protoplast. In various embodiments, a single vector includes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, about 15, about 20, or more guide RNA sequences; in other embodiments, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, about 15, about 20, or more guide RNA sequences are provided on multiple vectors, which can be delivered to one or multiple plant cells or plant protoplasts (e. g., delivered to an array of plant cells or plant protoplasts, or to a pooled population of plant cells or plant protoplasts).
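
By way of non-limiting illustration only, the promoter, guide-RNA-encoding DNA, and poly-T terminator arrangement described above can be sketched in Python as simple string assembly. The function names are hypothetical, the sequences shown are short placeholders rather than any real U6 promoter or guide sequence, and the seven-thymidine default terminator is an assumption made for the example.

```python
VALID_BASES = set("ACGT")


def guide_cassette(promoter, guide_dna, terminator="TTTTTTT"):
    """Concatenate promoter + guide-encoding DNA + terminator (5' to 3').

    All three parts must be non-empty strings over A, C, G, T.
    The default terminator length is an illustrative assumption.
    """
    parts = (("promoter", promoter), ("guide_dna", guide_dna), ("terminator", terminator))
    for name, seq in parts:
        if not seq or set(seq.upper()) - VALID_BASES:
            raise ValueError(f"{name} must be a non-empty DNA sequence")
    return promoter.upper() + guide_dna.upper() + terminator.upper()


def multiplex_construct(promoter, guide_dnas, terminator="TTTTTTT"):
    """Join one cassette per guide into a single construct."""
    return "".join(guide_cassette(promoter, g, terminator) for g in guide_dnas)


# Placeholder sequences only (not real promoter or guide sequences).
single = guide_cassette("acgt", "GGGG")
construct = multiplex_construct("ACGT", ["GGGG", "CCCC"])
print(single)
print(construct)
```

Here each guide RNA is given its own promoter and terminator; other arrangements for expressing several guide RNAs from one construct are equally possible and are not modeled by this sketch.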

In embodiments, one or more guide RNAs and the corresponding RNA-guided nuclease are delivered together or simultaneously. In other embodiments, one or more guide RNAs and the corresponding RNA-guided nuclease are delivered separately; these can be delivered in separate, discrete steps and using the same or different delivery techniques. In an example, an RNA-guided nuclease is delivered to a cell (such as a plant cell or plant protoplast) by particle bombardment, on carbon nanotubes, or by Agrobacterium-mediated transformation, and one or more guide RNAs are delivered to the cell in a separate step using the same or different delivery technique. In embodiments, an RNA-guided nuclease encoded by a DNA molecule or an mRNA is delivered to a cell with enough time prior to delivery of the guide RNA to permit expression of the nuclease in the cell; for example, an RNA-guided nuclease encoded by a DNA molecule or an mRNA is delivered to a plant cell or plant protoplast between 1-12 hours (e. g., about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 hours, or between about 1-6 hours or between about 2-6 hours) prior to the delivery of the guide RNA to the plant cell or plant protoplast. In embodiments, whether the RNA-guided nuclease is delivered simultaneously with or separately from an initial dose of guide RNA, succeeding “booster” doses of guide RNA are delivered subsequent to the delivery of the initial dose; for example, a second “booster” dose of guide RNA is delivered to a plant cell or plant protoplast between 1-12 hours (e. g., about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 hours, or between about 1-6 hours or between about 2-6 hours) subsequent to the delivery of the initial dose of guide RNA to the plant cell or plant protoplast. Similarly, in some embodiments, multiple deliveries of an RNA-guided nuclease or of a DNA molecule or an mRNA encoding an RNA-guided nuclease are used to increase efficiency of the genome modification.
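
By way of non-limiting illustration only, the staggered timing described above can be sketched in Python as a small scheduling helper. The function name and step labels are hypothetical; the 1-12 hour bounds are taken from the text, while measuring each booster interval from the preceding guide RNA dose (rather than from the initial dose) is an assumption made for the example.

```python
from datetime import datetime, timedelta

MIN_GAP_H, MAX_GAP_H = 1, 12  # interval bounds stated in the text


def delivery_schedule(nuclease_time, lead_hours, booster_gaps_hours=()):
    """Return (label, time) steps for a staggered delivery.

    The nuclease (as DNA or mRNA) goes first, the initial guide RNA dose
    follows lead_hours later, and each booster follows the previous guide
    RNA dose by the corresponding gap (an illustrative assumption).
    """
    for gap in (lead_hours, *booster_gaps_hours):
        if not MIN_GAP_H <= gap <= MAX_GAP_H:
            raise ValueError(f"interval of {gap} h is outside {MIN_GAP_H}-{MAX_GAP_H} h")
    steps = [("nuclease (DNA or mRNA)", nuclease_time)]
    current = nuclease_time + timedelta(hours=lead_hours)
    steps.append(("guide RNA, initial dose", current))
    for number, gap in enumerate(booster_gaps_hours, start=1):
        current += timedelta(hours=gap)
        steps.append((f"guide RNA, booster {number}", current))
    return steps


# Example: nuclease at 08:00, guide RNA 4 h later, one booster 6 h after that.
schedule = delivery_schedule(datetime(2024, 1, 1, 8, 0), 4, [6])
for label, when in schedule:
    print(f"{when:%H:%M}  {label}")
```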

In embodiments, the desired genome modification involves non-homologous recombination, in this case non-homologous end-joining of genomic sequence across one or more introduced double-strand breaks; generally, such embodiments do not require a donor template having homology “arms” (regions of homologous or complementary sequence to genomic sequence flanking the site of the DSB). In various embodiments described herein, donor polynucleotides encoding sequences for targeted insertion at double-stranded breaks are single-stranded polynucleotides comprising RNA or DNA or both types of nucleotides; or the donor polynucleotides are at least partially double-stranded and comprise RNA, DNA or both types of nucleotides. Other modified nucleotides may also be used.

In other embodiments, the desired genome modification involves homologous recombination, wherein one or more double-stranded DNA breaks in the target nucleotide sequence are generated by the RNA-guided nuclease and guide RNA(s), followed by repair of the break(s) using a homologous recombination mechanism (“homology-directed repair”). In such embodiments, a donor template that encodes the desired nucleotide sequence to be inserted or knocked-in at the double-stranded break and generally having homology “arms” (regions of homologous or complementary sequence to genomic sequence flanking the site of the DSB) is provided to the cell (such as a plant cell or plant protoplast); examples of suitable templates include single-stranded DNA templates and double-stranded DNA templates (e. g., in the form of a plasmid). In general, a donor template encoding a nucleotide change over a region of less than about 50 nucleotides is conveniently provided in the form of single-stranded DNA; larger donor templates (e. g., more than 100 nucleotides) are often conveniently provided as double-stranded DNA plasmids.

In certain embodiments directed to the targeted incorporation of sequences by homologous recombination, a donor template has a core nucleotide sequence that differs from the target nucleotide sequence (e. g., a homologous endogenous genomic region) by at least 1, at least 5, at least 10, at least 20, at least 30, at least 40, at least 50, or more nucleotides. This core sequence is flanked by “homology arms” or regions of high sequence identity with the targeted nucleotide sequence; in embodiments, the regions of high identity include at least 10, at least 50, at least 100, at least 150, at least 200, at least 300, at least 400, at least 500, at least 600, at least 750, or at least 1000 nucleotides on each side of the core sequence. In embodiments where the donor template is in the form of a single-stranded DNA, the core sequence is flanked by homology arms including at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, or at least 100 nucleotides on each side of the core sequence. In embodiments where the donor template is in the form of a double-stranded DNA plasmid, the core sequence is flanked by homology arms including at least 500, at least 600, at least 700, at least 800, at least 900, or at least 1000 nucleotides on each side of the core sequence. In an embodiment, two separate double-strand breaks are introduced into the cell's target nucleotide sequence with a “double nickase” Cas9 (see Ran et al. (2013) Cell, 154:1380-1389), followed by delivery of the donor template.
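
By way of non-limiting illustration only, the relationship between a core sequence and its flanking homology arms can be sketched in Python as follows. The function name, the format labels, and the toy genomic sequence are hypothetical; the minimum arm lengths (10 nucleotides for single-stranded DNA, 500 for a double-stranded DNA plasmid) are the smallest values recited above, and taking the arms from the sequence immediately adjacent to the break site is an assumption made for the example.

```python
# Smallest arm lengths recited in the text for each donor format.
MIN_ARM_NT = {"ssDNA": 10, "dsDNA_plasmid": 500}


def build_donor(genome, cut_index, core, arm_length, donor_format="ssDNA"):
    """Return left arm + core + right arm for insertion at cut_index.

    The arms are copied from the genomic sequence immediately 5' and 3'
    of the break site (illustrative assumption).
    """
    if donor_format not in MIN_ARM_NT:
        raise ValueError(f"unknown donor format: {donor_format}")
    if arm_length < MIN_ARM_NT[donor_format]:
        raise ValueError(
            f"{donor_format} arms must be at least {MIN_ARM_NT[donor_format]} nt"
        )
    if cut_index - arm_length < 0 or cut_index + arm_length > len(genome):
        raise ValueError("homology arms would extend past the supplied sequence")
    left_arm = genome[cut_index - arm_length:cut_index]
    right_arm = genome[cut_index:cut_index + arm_length]
    return left_arm + core + right_arm


# Toy sequence: the break falls between a run of A and a run of C.
toy_genome = "A" * 20 + "C" * 20
donor = build_donor(toy_genome, cut_index=20, core="GGG", arm_length=10)
print(donor)
```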

Delivery Methods and Agents

Aspects of the invention involve various treatments employed to deliver to a plant cell or protoplast a guide RNA (gRNA), such as a crRNA or sgRNA (or a polynucleotide encoding such), and/or a polynucleotide encoding a sequence for targeted insertion at a double-strand break in a genome. In embodiments, one or more treatments are employed to deliver the gRNA into a plant cell or plant protoplast, e. g., through barriers such as a cell wall or a plasma membrane or nuclear envelope or other lipid bilayer.

Unless otherwise stated, the various compositions and methods described herein for delivering guide RNAs and nucleases to a plant cell or protoplast are also generally useful for delivering donor polynucleotides to the cell. The delivery of donor polynucleotides can be simultaneous with, or separate from (generally after) delivery of the nuclease and guide RNA to the cell. For example, a donor polynucleotide can be transiently introduced into a plant cell or plant protoplast, optionally with the nuclease and/or gRNA; in certain embodiments, the donor template is provided to the plant cell or plant protoplast in a quantity that is sufficient to achieve the desired insertion of the donor polynucleotide sequence but donor polynucleotides do not persist in the plant cell or plant protoplast after a given period of time (e. g., after one or more cell division cycles).

In certain embodiments, a gRNA or donor polynucleotide, in addition to other agents involved in targeted modifications, can be delivered to a plant cell or protoplast by directly contacting the plant cell or protoplast with a composition comprising the gRNA(s) or donor polynucleotide(s). For example, a gRNA-containing composition in the form of a liquid, a solution, a suspension, an emulsion, a reverse emulsion, a colloid, a dispersion, a gel, liposomes, micelles, an injectable material, an aerosol, a solid, a powder, a particulate, a nanoparticle, or a combination thereof can be applied directly to a plant cell (or plant part or tissue containing the plant cell) or plant protoplast (e. g., through abrasion or puncture or other disruption of the cell wall or cell membrane, by spraying or dipping or soaking or otherwise directly contacting, or by microinjection). In certain embodiments, a plant cell (or plant part or tissue containing the plant cell) or plant protoplast is soaked in a liquid gRNA-containing composition, whereby the gRNA is delivered to the plant cell or plant protoplast. In embodiments, the gRNA-containing composition is delivered using negative or positive pressure, for example, using vacuum infiltration or application of hydrodynamic or fluid pressure. In embodiments, the gRNA-containing composition is introduced into a plant cell or plant protoplast, e. g., by microinjection or by disruption or deformation of the cell wall or cell membrane, for example by physical treatments such as by application of negative or positive pressure, shear forces, or treatment with a chemical or physical delivery agent such as surfactants, liposomes, or nanoparticles; see, e. g., delivery of materials to cells employing microfluidic flow through a cell-deforming constriction as described in US Published Patent Application 2014/0287509, incorporated by reference in its entirety herein. 
Other techniques useful for delivering the gRNA-containing composition to a plant cell or plant protoplast include: ultrasound or sonication; vibration, friction, shear stress, vortexing, cavitation; centrifugation or application of mechanical force; mechanical cell wall or cell membrane deformation or breakage; enzymatic cell wall or cell membrane breakage or permeabilization; abrasion or mechanical scarification (e. g., abrasion with carborundum or other particulate abrasive or scarification with a file or sandpaper) or chemical scarification (e. g., treatment with an acid or caustic agent); and electroporation. In embodiments, the gRNA-containing composition is provided by bacterially mediated (e. g., Agrobacterium sp., Rhizobium sp., Sinorhizobium sp., Mesorhizobium sp., Bradyrhizobium sp., Azobacter sp., Phyllobacterium sp.) transfection of the plant cell or plant protoplast with a polynucleotide encoding the gRNA; see, e. g., Broothaerts et al. (2005) Nature, 433:629-633. Any of these techniques or a combination thereof are alternatively employed on the plant part or tissue or intact plant (or seed) from which a plant cell or plant protoplast is optionally subsequently obtained or isolated; in embodiments, the gRNA-containing composition is delivered in a separate step after the plant cell or plant protoplast has been obtained or isolated.

In embodiments, a treatment employed in delivery of a gRNA to a plant cell or plant protoplast is carried out under a specific thermal regime, which can involve one or more appropriate temperatures, e. g., chilling or cold stress (exposure to temperatures below that at which normal plant growth occurs), or heating or heat stress (exposure to temperatures above that at which normal plant growth occurs), or treating at a combination of different temperatures. In embodiments, a specific thermal regime is carried out on the plant cell or plant protoplast, or on a plant or plant part from which a plant cell or plant protoplast is subsequently obtained or isolated, in one or more steps separate from the gRNA delivery.

In embodiments, a whole plant or plant part or seed, or an isolated plant cell or plant protoplast, or the plant or plant part from which a plant cell or plant protoplast is obtained or isolated, is treated with one or more delivery agents which can include at least one chemical, enzymatic, or physical agent, or a combination thereof. In embodiments, a gRNA-containing composition further includes one or more chemical, enzymatic, or physical agents for delivery. In embodiments that further include the step of providing an RNA-guided nuclease or a polynucleotide that encodes the RNA-guided nuclease, a gRNA-containing composition including the RNA-guided nuclease or polynucleotide that encodes the RNA-guided nuclease further includes one or more chemical, enzymatic, or physical agents for delivery. Treatment with the chemical, enzymatic or physical agent can be carried out simultaneously with the gRNA delivery, with the RNA-guided nuclease delivery, or in one or more separate steps that precede or follow the gRNA delivery or the RNA-guided nuclease delivery. In embodiments, a chemical, enzymatic, or physical agent, or a combination of these, is associated or complexed with the polynucleotide composition, with the gRNA or polynucleotide that encodes or is processed to the gRNA, or with the RNA-guided nuclease or polynucleotide that encodes the RNA-guided nuclease; examples of such associations or complexes include those involving non-covalent interactions (e. g., ionic or electrostatic interactions, hydrophobic or hydrophilic interactions, formation of liposomes, micelles, or other heterogeneous composition) and covalent interactions (e. g., peptide bonds, bonds formed using cross-linking agents). 
In non-limiting examples, a gRNA or polynucleotide that encodes or is processed to the gRNA is provided as a liposomal complex with a cationic lipid; a gRNA or polynucleotide that encodes or is processed to the gRNA is provided as a complex with a carbon nanotube; and an RNA-guided nuclease is provided as a fusion protein between the nuclease and a cell-penetrating peptide. Examples of agents useful for delivering a gRNA or polynucleotide that encodes or is processed to the gRNA or a nuclease or polynucleotide that encodes the nuclease include the various cationic liposomes and polymer nanoparticles reviewed by Zhang et al. (2007) J. Controlled Release, 123:1-10, and the cross-linked multilamellar liposomes described in U. S. Patent Application Publication 2014/0356414 A1, incorporated by reference in its entirety herein.

In embodiments, the chemical agent is at least one selected from the group consisting of:

    -   (a) solvents (e. g., water, dimethylsulfoxide, dimethylformamide, acetonitrile, N-pyrrolidine, pyridine, hexamethylphosphoramide, alcohols, alkanes, alkenes, dioxanes, polyethylene glycol, and other solvents miscible or emulsifiable with water or that will dissolve phosphonucleotides in non-aqueous systems);
    -   (b) fluorocarbons (e. g., perfluorodecalin, perfluoromethyldecalin);
    -   (c) glycols or polyols (e. g., propylene glycol, polyethylene glycol);
    -   (d) surfactants, including cationic surfactants, anionic surfactants, non-ionic surfactants, and amphiphilic surfactants, e. g., alkyl or aryl sulfates, phosphates, sulfonates, or carboxylates; primary, secondary, or tertiary amines; quaternary ammonium salts; sultaines, betaines; cationic lipids; phospholipids; tallowamine; bile acids such as cholic acid; long chain alcohols; organosilicone surfactants including nonionic organosilicone surfactants such as trisiloxane ethoxylate surfactants or a silicone polyether copolymer such as a copolymer of polyalkylene oxide modified heptamethyl trisiloxane and allyloxypolypropylene glycol methylether (commercially available as SILWET L77™ brand surfactant having CAS Number 27306-78-1 and EPA Number CAL. REG. NO. 5905-50073-AA, Momentive Performance Materials, Inc., Albany, N.Y.); specific examples of useful surfactants include sodium lauryl sulfate, the Tween series of surfactants, Triton-X100, Triton-X114, CHAPS and CHAPSO, Tergitol-type NP-40, Nonidet P-40;
    -   (e) lipids, lipoproteins, lipopolysaccharides;
    -   (f) acids, bases, caustic agents;
    -   (g) peptides, proteins, or enzymes (e. g., cellulase, pectolyase, maceroenzyme, pectinase), including cell-penetrating or pore-forming peptides (e. g., (BO100)2K8, Genscript; poly-lysine, poly-arginine, or poly-homoarginine peptides; gamma zein, see U. S. Patent Application publication 2011/0247100, incorporated herein by reference in its entirety; transcription activator of human immunodeficiency virus type 1 (“HIV-1 Tat”) and other Tat proteins, see, e. g., www[dot]lifetein[dot]com/Cell_Penetrating_Peptides[dot]html and Järver (2012) Mol. Therapy-Nucleic Acids, 1:e27, 1-17); octa-arginine or nona-arginine; poly-homoarginine (see Unnamalai et al. (2004) FEBS Letters, 566:307-310); see also the database of cell-penetrating peptides CPPsite 2.0 publicly available at crdd[dot]osdd[dot]net/raghava/cppsite/;
    -   (h) RNase inhibitors;
    -   (i) cationic branched or linear polymers such as chitosan, poly-lysine, DEAE-dextran, polyvinylpyrrolidone (“PVP”), or polyethylenimine (“PEI”, e. g., PEI, branched, MW 25,000, CAS #9002-98-6; PEI, linear, MW 5000, CAS #9002-98-6; PEI linear, MW 2500, CAS #9002-98-6);
    -   (j) dendrimers (see, e. g., U. S. Patent Application Publication 2011/0093982, incorporated herein by reference in its entirety);
    -   (k) counter-ions, amines or polyamines (e. g., spermine, spermidine, putrescine), osmolytes, buffers, and salts (e. g., calcium phosphate, ammonium phosphate);
    -   (l) polynucleotides (e. g., non-specific double-stranded DNA, salmon sperm DNA);
    -   (m) transfection agents (e. g., Lipofectin®, Lipofectamine®, Oligofectamine®, and Invivofectamine® (all from Thermo Fisher Scientific, Waltham, Mass.), PepFect (see Ezzat et al. (2011) Nucleic Acids Res., 39:5284-5298), TransIT® transfection reagents (Mirus Bio, LLC, Madison, Wis.), and poly-lysine, poly-homoarginine, and poly-arginine molecules including octa-arginine and nona-arginine as described in Lu et al. (2010) J. Agric. Food Chem., 58:2288-2294);
    -   (n) antibiotics, including non-specific DNA double-strand-break-inducing agents (e. g., phleomycin, bleomycin, talisomycin); and
    -   (o) antioxidants (e. g., glutathione, dithiothreitol, ascorbate).

In embodiments, the chemical agent is provided simultaneously with the gRNA (or polynucleotide encoding the gRNA or that is processed to the gRNA), for example, the polynucleotide composition including the gRNA further includes one or more chemical agents. In embodiments, the gRNA or polynucleotide encoding the gRNA or that is processed to the gRNA is covalently or non-covalently linked or complexed with one or more chemical agents; for example, the gRNA or polynucleotide encoding the gRNA or that is processed to the gRNA can be covalently linked to a peptide or protein (e. g., a cell-penetrating peptide or a pore-forming peptide) or non-covalently complexed with cationic lipids, polycations (e. g., polyamines), or cationic polymers (e. g., PEI). In embodiments, the gRNA or polynucleotide encoding the gRNA or that is processed to the gRNA is complexed with one or more chemical agents to form, e. g., a solution, liposome, micelle, emulsion, reverse emulsion, suspension, colloid, or gel.

In embodiments, the physical agent is at least one selected from the group consisting of particles or nanoparticles (e. g., particles or nanoparticles made of materials such as carbon, silicon, silicon carbide, gold, tungsten, polymers, or ceramics) in various size ranges and shapes, magnetic particles or nanoparticles (e. g., silenceMag Magnetotransfection™ agent, OZ Biosciences, San Diego, Calif.), abrasive or scarifying agents, needles or microneedles, matrices, and grids. In embodiments, particulates and nanoparticulates are useful in delivery of the polynucleotide composition or the nuclease or both. Useful particulates and nanoparticles include those made of metals (e. g., gold, silver, tungsten, iron, cerium), ceramics (e. g., aluminum oxide, silicon carbide, silicon nitride, tungsten carbide), polymers (e. g., polystyrene, polydiacetylene, and poly(3,4-ethylenedioxythiophene) hydrate), semiconductors (e. g., quantum dots), silicon (e. g., silicon carbide), carbon (e. g., graphite, graphene, graphene oxide, or carbon nanosheets, nanocomplexes, or nanotubes), and composites (e. g., polyvinylcarbazole/graphene, polystyrene/graphene, platinum/graphene, palladium/graphene nanocomposites). In embodiments, such particulates and nanoparticulates are further covalently or non-covalently functionalized, or further include modifiers or cross-linked materials such as polymers (e. g., linear or branched polyethylenimine, poly-lysine), polynucleotides (e. g., DNA or RNA), polysaccharides, lipids, polyglycols (e. g., polyethylene glycol, thiolated polyethylene glycol), polypeptides or proteins, and detectable labels (e. g., a fluorophore, an antigen, an antibody, or a quantum dot). In various embodiments, such particulates and nanoparticles are neutral, or carry a positive charge, or carry a negative charge. Embodiments of compositions including particulates include those formulated, e. g., as liquids, colloids, dispersions, suspensions, aerosols, gels, and solids. Embodiments include nanoparticles affixed to a surface or support, e. g., an array of carbon nanotubes vertically aligned on a silicon or copper wafer substrate. Embodiments include polynucleotide compositions including particulates (e. g., gold or tungsten or magnetic particles) delivered by a Biolistic-type technique or with magnetic force. The size of the particles used in Biolistics is generally in the “microparticle” range, for example, gold microcarriers in the 0.6, 1.0, and 1.6 micrometer size ranges (see, e. g., instruction manual for the Helios® Gene Gun System, Bio-Rad, Hercules, Calif.; Randolph-Anderson et al. (2015) “Sub-micron gold particles are superior to larger particles for efficient Biolistic® transformation of organelles and some cell types”, Bio-Rad US/EG Bulletin 2015), but successful Biolistics delivery using smaller (40 nanometer) nanoparticles has been reported in cultured animal cells; see O'Brian and Lummis (2011) BMC Biotechnol., 11:66-71. Other embodiments of useful particulates are nanoparticles, which are generally in the nanometer (nm) size range or less than 1 micrometer, e. g., with a diameter of less than about 1 nm, less than about 3 nm, less than about 5 nm, less than about 10 nm, less than about 20 nm, less than about 40 nm, less than about 60 nm, less than about 80 nm, or less than about 100 nm. Specific, non-limiting embodiments of nanoparticles commercially available (all from Sigma-Aldrich Corp., St. Louis, Mo.) include gold nanoparticles with diameters of 5, 10, or 15 nm; silver nanoparticles with particle sizes of 10, 20, 40, 60, or 100 nm; palladium “nanopowder” of less than 25 nm particle size; single-, double-, and multi-walled carbon nanotubes, e. g., with diameters of 0.7-1.1, 1.3-2.3, 0.7-0.9, or 0.7-1.3 nm, or with nanotube bundle dimensions of 2-10 nm by 1-5 micrometers, 6-9 nm by 5 micrometers, 7-15 nm by 0.5-10 micrometers, 7-12 nm by 0.5-10 micrometers, 110-170 nm by 5-9 micrometers, 6-13 nm by 2.5-20 micrometers. Embodiments include polynucleotide compositions including materials such as gold, silicon, cerium, or carbon, e. g., gold or gold-coated nanoparticles, silicon carbide whiskers, carborundum, porous silica nanoparticles, gelatin/silica nanoparticles, nanoceria or cerium oxide nanoparticles (CNPs), carbon nanotubes (CNTs) such as single-, double-, or multi-walled carbon nanotubes and their chemically functionalized versions (e. g., carbon nanotubes functionalized with amide, amino, carboxylic acid, sulfonic acid, or polyethylene glycol moieties), and graphene or graphene oxide or graphene complexes; see, for example, Wong et al. (2016) Nano Lett., 16:1161-1172; Giraldo et al. (2014) Nature Materials, 13:400-409; Shen et al. (2012) Theranostics, 2:283-294; Kim et al. (2011) Bioconjugate Chem., 22:2558-2567; Wang et al. (2010) J. Am. Chem. Soc. Comm., 132:9274-9276; Zhao et al. (2016) Nanoscale Res. Lett., 11:195-203; and Choi et al. (2016) J. Controlled Release, 235:222-235. See also, for example, the various types of particles and nanoparticles, their preparation, and methods for their use, e. g., in delivering polynucleotides and polypeptides to cells, disclosed in U. S. Patent Application Publications 2010/0311168, 2012/0023619, 2012/0244569, 2013/0145488, 2013/0185823, 2014/0096284, 2015/0040268, 2015/0047074, and 2015/0208663, all of which are incorporated herein by reference in their entirety.

In embodiments wherein the gRNA (or polynucleotide encoding the gRNA) is provided in a composition that further includes an RNA-guided nuclease (or a polynucleotide that encodes the RNA-guided nuclease), or wherein the method further includes the step of providing an RNA-guided nuclease or a polynucleotide that encodes the RNA-guided nuclease, one or more chemical, enzymatic, or physical agents can similarly be employed. In embodiments, the RNA-guided nuclease (or polynucleotide encoding the RNA-guided nuclease) is provided separately, e. g., in a separate composition including the RNA-guided nuclease or polynucleotide encoding the RNA-guided nuclease. Such compositions can include other chemical or physical agents (e. g., solvents, surfactants, proteins or enzymes, transfection agents, particulates or nanoparticulates), such as those described above as useful in the polynucleotide composition used to provide the gRNA. For example, porous silica nanoparticles are useful for delivering a DNA recombinase into maize cells; see, e. g., Martin-Ortigosa et al. (2015) Plant Physiol., 164:537-547. In an embodiment, the polynucleotide composition includes a gRNA and Cas9 nuclease, and further includes a surfactant and a cell-penetrating peptide. In an embodiment, the polynucleotide composition includes a plasmid that encodes both an RNA-guided nuclease and at least one gRNA, and further includes a surfactant and carbon nanotubes. In an embodiment, the polynucleotide composition includes multiple gRNAs and an mRNA encoding the RNA-guided nuclease, and further includes gold particles, and the polynucleotide composition is delivered to a plant cell or plant protoplast by Biolistics.

In related embodiments, one or more chemical, enzymatic, or physical agents can be used in one or more steps separate from (preceding or following) that in which the gRNA is provided. In an embodiment, the plant or plant part from which a plant cell or plant protoplast is obtained or isolated is treated with one or more chemical, enzymatic, or physical agents in the process of obtaining or isolating the plant cell or plant protoplast. In embodiments, the plant or plant part is treated with an abrasive, a caustic agent, a surfactant such as Silwet L-77 or a cationic lipid, or an enzyme such as cellulase.

In embodiments, a gRNA is delivered to plant cells or plant protoplasts prepared or obtained from a plant, plant part, or plant tissue that has been treated with the polynucleotide compositions (and optionally the nuclease). In embodiments, one or more chemical, enzymatic, or physical agents, separately or in combination with the polynucleotide composition, are provided or applied at a location in the plant or plant part other than the plant location, part, or tissue from which the plant cell or plant protoplast is obtained or isolated. In embodiments, the polynucleotide composition is applied to adjacent or distal cells or tissues and is transported (e. g., through the vascular system or by cell-to-cell movement) to the meristem from which plant cells or plant protoplasts are subsequently isolated. In embodiments, a gRNA-containing composition is applied by soaking a seed or seed fragment or zygotic or somatic embryo in the gRNA-containing composition, whereby the gRNA is delivered to the seed or seed fragment or zygotic or somatic embryo from which plant cells or plant protoplasts are subsequently isolated. In embodiments, a flower bud or shoot tip is contacted with a gRNA-containing composition, whereby the gRNA is delivered to cells in the flower bud or shoot tip from which plant cells or plant protoplasts are subsequently isolated. In embodiments, a gRNA-containing composition is applied to the surface of a plant or of a part of a plant (e. g., a leaf surface), whereby the gRNA is delivered to tissues of the plant from which plant cells or plant protoplasts are subsequently isolated. In embodiments, a whole plant or plant tissue is subjected to particle- or nanoparticle-mediated delivery (e. g., Biolistics or carbon nanotube or nanoparticle delivery) of a gRNA-containing composition, whereby the gRNA is delivered to cells or tissues from which plant cells or plant protoplasts are subsequently isolated.

Methods of Modulating Expression of a Sequence of Interest in a Genome

In one aspect, the invention provides a method of changing expression of a sequence of interest in a genome, including integrating a sequence encoded by a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule at the site of at least one double-strand break (DSB) in a genome. The method permits site-specific integration of heterologous sequence at the site of at least one DSB, and thus at one or more locations in a genome, such as a genome of a plant cell. In embodiments, the genome is that of a nucleus, mitochondrion, or plastid in a plant cell.

By “integration of heterologous sequence” is meant integration or insertion of one or more nucleotides, resulting in a sequence (including the inserted nucleotide(s) as well as at least some adjacent nucleotides of the genomic sequence flanking the site of insertion at the DSB) that is heterologous, i. e., would not otherwise or does not normally occur at the site of insertion. (The term “heterologous” is also used to refer to a given sequence in relationship to another; e. g., the sequence of the polynucleotide donor molecule is heterologous to the sequence at the site of the DSB wherein the polynucleotide is integrated.)

The at least one DSB is introduced into the genome by any suitable technique; in embodiments one or more DSBs is introduced into the genome in a site- or sequence-specific manner, for example, by use of at least one of the group of DSB-inducing agents consisting of: (a) a nuclease capable of effecting site-specific alteration of a target nucleotide sequence, selected from the group consisting of an RNA-guided nuclease, an RNA-guided DNA endonuclease, a type II Cas nuclease, a Cas9, a type V Cas nuclease, a Cpf1, a CasY, a CasX, a C2c1, a C2c3, an engineered nuclease, a codon-optimized nuclease, a zinc-finger nuclease (ZFN), a transcription activator-like effector nuclease (TAL-effector nuclease), an Argonaute, and a meganuclease or engineered meganuclease; (b) a polynucleotide encoding one or more nucleases capable of effecting site-specific alteration (such as introduction of a DSB) of a target nucleotide sequence; and (c) a guide RNA (gRNA) for an RNA-guided nuclease, or a DNA encoding a gRNA for an RNA-guided nuclease. In embodiments, one or more DSBs is introduced into the genome by use of both a guide RNA (gRNA) and the corresponding RNA-guided nuclease. In an example, one or more DSBs is introduced into the genome by use of a ribonucleoprotein (RNP) that includes both a gRNA (e. g., a single-guide RNA or sgRNA that includes both a crRNA and a tracrRNA) and a Cas9. It is generally desirable that the sequence encoded by the polynucleotide donor molecule is integrated at the site of the DSB at high efficiency. One measure of efficiency is the percentage or fraction of the population of cells that have been treated with a DSB-inducing agent and polynucleotide donor molecule, and in which a sequence encoded by the polynucleotide donor molecule is successfully introduced at the DSB correctly located in the genome. The efficiency of genome editing including integration of a sequence encoded by a polynucleotide donor molecule at a DSB in the genome is assessed by any suitable method such as a heteroduplex cleavage assay or by sequencing, as described elsewhere in this disclosure. In various embodiments, the DSB is induced in the correct location in the genome at a comparatively high efficiency, e. g., at about 10, about 15, about 20, about 30, about 40, about 50, about 60, about 70, or about 80 percent efficiency, or at greater than 80, 85, 90, or 95 percent efficiency (measured as the percentage of the total population of cells in which the DSB is induced at the correct location in the genome). In various embodiments, a sequence encoded by the polynucleotide donor molecule is integrated at the site of the DSB at a comparatively high efficiency, e. g., at about 10, about 15, about 20, about 30, about 40, about 50, about 60, about 70, or about 80 percent efficiency, or at greater than 80, 85, 90, or 95 percent efficiency (measured as the percentage of the total population of cells in which the polynucleotide molecule is integrated at the site of the DSB in the correct location in the genome).
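
As an illustration only, the efficiency measure described above (the percentage of the treated cell population in which the DSB is induced, or the donor sequence is integrated, at the correct location) can be sketched in Python. The function name and the cell counts below are hypothetical and are not taken from this disclosure; how edited cells are counted (e. g., by sequencing) is outside the sketch.

```python
def editing_efficiency(cells_treated: int, cells_edited: int) -> float:
    """Return the percentage of treated cells that carry the intended edit.

    cells_treated: total population of cells exposed to the DSB-inducing
                   agent (and, where relevant, the donor molecule).
    cells_edited:  cells in which the DSB or the integration is found at
                   the correct genomic location.
    """
    if cells_treated <= 0:
        raise ValueError("cells_treated must be a positive count")
    if not 0 <= cells_edited <= cells_treated:
        raise ValueError("cells_edited must lie between 0 and cells_treated")
    return 100.0 * cells_edited / cells_treated


# Hypothetical counts, for illustration only.
dsb_efficiency = editing_efficiency(cells_treated=1000, cells_edited=320)
integration_efficiency = editing_efficiency(cells_treated=1000, cells_edited=150)
```

The two values are reported separately because a cell can carry a correctly located DSB without having integrated the donor sequence.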

Apart from the CRISPR-type nucleases, other nucleases capable of effecting site-specific alteration of a target nucleotide sequence include zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TAL-effector nucleases or TALENs), Argonaute proteins, and a meganuclease or engineered meganuclease. Zinc finger nucleases (ZFNs) are engineered proteins comprising a zinc finger DNA-binding domain fused to a nucleic acid cleavage domain, e. g., a nuclease. The zinc finger binding domains provide specificity and can be engineered to specifically recognize any desired target DNA sequence. For a review of the construction and use of ZFNs in plants and other organisms, see, e. g., Urnov et al. (2010) Nature Rev. Genet., 11:636-646. The zinc finger DNA binding domains are derived from the DNA-binding domain of a large class of eukaryotic transcription factors called zinc finger proteins (ZFPs). The DNA-binding domain of ZFPs typically contains a tandem array of at least three zinc “fingers” each recognizing a specific triplet of DNA. A number of strategies can be used to design the binding specificity of the zinc finger binding domain. One approach, termed “modular assembly”, relies on the functional autonomy of individual zinc fingers with DNA. In this approach, a given sequence is targeted by identifying zinc fingers for each component triplet in the sequence and linking them into a multifinger peptide. Several alternative strategies for designing zinc finger DNA binding domains have also been developed. These methods are designed to accommodate the ability of zinc fingers to contact neighboring fingers as well as nucleotide bases outside their target triplet. Typically, the engineered zinc finger DNA binding domain has a novel binding specificity, compared to a naturally-occurring zinc finger protein. Modification methods include, for example, rational design and various types of selection. Rational design includes, for example, the use of databases of triplet (or quadruplet) nucleotide sequences and individual zinc finger amino acid sequences, in which each triplet or quadruplet nucleotide sequence is associated with one or more amino acid sequences of zinc fingers which bind the particular triplet or quadruplet sequence. See, e. g., U.S. Pat. Nos. 6,453,242 and 6,534,261, both incorporated herein by reference in their entirety. Exemplary selection methods (e. g., phage display and yeast two-hybrid systems) are well known and described in the literature. In addition, enhancement of binding specificity for zinc finger binding domains has been described in U.S. Pat. No. 6,794,136, incorporated herein by reference in its entirety. In addition, individual zinc finger domains may be linked together using any suitable linker sequences. Examples of linker sequences are publicly known, e. g., see U.S. Pat. Nos. 6,479,626; 6,903,185; and 7,153,949, incorporated herein by reference in their entirety. The nucleic acid cleavage domain is non-specific and is typically a restriction endonuclease, such as Fok1. This endonuclease must dimerize to cleave DNA. Thus, cleavage by Fok1 as part of a ZFN requires two adjacent and independent binding events, which must occur in both the correct orientation and with appropriate spacing to permit dimer formation. The requirement for two DNA binding events enables more specific targeting of long and potentially unique recognition sites. Fok1 variants with enhanced activities have been described; see, e. g., Guo et al. (2010) J. Mol. Biol., 400:96-107.
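
The “modular assembly” approach described above (identifying a zinc finger for each component triplet of the target and linking the fingers together) can be sketched, purely as an illustration, as a lookup over a triplet-to-finger table. The table contents, the finger labels, and the function names below are hypothetical placeholders, not entries from any real database, and the sketch ignores finger orientation, linker choice, and the neighboring-finger effects noted above.

```python
def split_into_triplets(target: str) -> list[str]:
    """Split a target DNA sequence into consecutive 3-nucleotide triplets."""
    if len(target) == 0 or len(target) % 3 != 0:
        raise ValueError("target length must be a non-zero multiple of 3")
    return [target[i:i + 3] for i in range(0, len(target), 3)]


def assemble_fingers(target: str, finger_library: dict[str, str]) -> list[str]:
    """Pick one finger per triplet, in target order, from a lookup table.

    finger_library maps a DNA triplet to a label for a finger reported to
    bind it; the labels used here are placeholders.
    """
    triplets = split_into_triplets(target.upper())
    missing = [t for t in triplets if t not in finger_library]
    if missing:
        raise KeyError(f"no finger available for triplets: {missing}")
    return [finger_library[t] for t in triplets]


# Hypothetical library with placeholder labels (not real finger sequences).
example_library = {"GCG": "finger_for_GCG", "TGG": "finger_for_TGG"}
example_array = assemble_fingers("GCGTGGGCG", example_library)
```

A nine-nucleotide target yields a three-finger array, consistent with the tandem arrays of at least three fingers mentioned above.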

Transcription activator-like effectors (TALEs) are proteins secreted by certain Xanthomonas species to modulate gene expression in host plants and to facilitate the colonization by and survival of the bacterium. TALEs act as transcription factors and modulate expression of resistance genes in the plants. Recent studies of TALEs have revealed the code linking the repetitive region of TALEs with their target DNA-binding sites. TALEs comprise a highly conserved and repetitive region consisting of tandem repeats of mostly 33 or 34 amino acid segments. The repeat monomers differ from each other mainly at amino acid positions 12 and 13. A strong correlation between unique pairs of amino acids at positions 12 and 13 and the corresponding nucleotide in the TALE-binding site has been found. The simple relationship between amino acid sequence and DNA recognition of the TALE binding domain allows for the design of DNA binding domains of any desired specificity. TALEs can be linked to a non-specific DNA cleavage domain to prepare genome editing proteins, referred to as TAL-effector nucleases or TALENs. As in the case of ZFNs, a restriction endonuclease, such as Fok1, can be conveniently used. For a description of the use of TALENs in plants, see Mahfouz et al. (2011) Proc. Natl. Acad. Sci. USA, 108:2623-2628 and Mahfouz (2011) GM Crops, 2:99-103.

Argonautes are proteins that can function as sequence-specific endonucleases by binding a polynucleotide (e. g., a single-stranded DNA or single-stranded RNA that includes sequence complementary to a target nucleotide sequence) that guides the Argonaute to the target nucleotide sequence and effects site-specific alteration of the target nucleotide sequence; see, e. g., U. S. Patent Application Publication 2015/0089681, incorporated herein by reference in its entirety.

Another method of effecting targeted changes to a genome is the use of triplex-forming peptide nucleic acids (PNAs) designed to bind site-specifically to genomic DNA via strand invasion and the formation of PNA/DNA/PNA triplexes (via both Watson-Crick and Hoogsteen binding) with a displaced DNA strand. PNAs consist of a charge-neutral peptide-like backbone and nucleobases. The nucleobases hybridize to DNA with high affinity. The triplexes then recruit the cell's endogenous DNA repair systems to initiate site-specific modification of the genome. The desired sequence modification is provided by single-stranded ‘donor DNAs’ which are co-delivered as templates for repair. See, e. g., Bahal R et al. (2016) Nature Communications, Oct. 26, 2016.

In related embodiments, zinc finger nucleases, TALENs, and Argonautes are used in conjunction with other functional domains. For example, the nuclease activity of these nucleic acid targeting systems can be altered so that the enzyme binds to but does not cleave the DNA. Examples of functional domains include transposase domains, integrase domains, recombinase domains, resolvase domains, invertase domains, protease domains, DNA methyltransferase domains, DNA hydroxylmethylase domains, DNA demethylase domains, histone acetylase domains, histone deacetylase domains, nuclease domains, repressor domains, activator domains, nuclear-localization signal domains, transcription-regulatory protein (or transcription complex recruiting) domains, cellular uptake activity associated domains, nucleic acid binding domains, antibody presentation domains, histone modifying enzymes, recruiter of histone modifying enzymes, inhibitor of histone modifying enzymes, histone methyltransferases, histone demethylases, histone kinases, histone phosphatases, histone ribosylases, histone deribosylases, histone ubiquitinases, histone deubiquitinases, histone biotinases and histone tail proteases. Non-limiting examples of functional domains include a transcriptional activation domain, a transcription repression domain, and an SHH1, SUVH2, or SUVH9 polypeptide capable of reducing expression of a target nucleotide sequence via epigenetic modification; see, e. g., U. S. Patent Application Publication 2016/0017348, incorporated herein by reference in its entirety. Genomic DNA may also be modified via base editing using a fusion in which a catalytically inactive Cas9 (dCas9) is fused to a cytidine deaminase that converts cytosine (C) to uridine (U), thereby effecting a C to T substitution; see Komor et al. (2016) Nature, 533:420-424.

In embodiments, the guide RNA (gRNA) has a sequence of between 16 and 24 nucleotides in length (e. g., 16, 17, 18, 19, 20, 21, 22, 23, or 24 nucleotides in length). Specific embodiments include gRNAs of 19, 20, or 21 nucleotides in length and having 100% complementarity to the target nucleotide sequence. In many embodiments the gRNA has exact complementarity (i. e., perfect base-pairing) to the target nucleotide sequence; in certain other embodiments the gRNA has less than 100% complementarity to the target nucleotide sequence. The design of effective gRNAs for use in plant genome editing is disclosed in U. S. Patent Application Publication 2015/0082478 A1, the entire specification of which is incorporated herein by reference. In embodiments where multiple gRNAs are employed, the multiple gRNAs can be delivered separately (as separate RNA molecules or encoded by separate DNA molecules) or in combination, e. g., as an RNA molecule containing multiple gRNA sequences, or as a DNA molecule encoding an RNA molecule containing multiple gRNA sequences; see, for example, U. S. Patent Application Publication 2016/0264981 A1, the entire specification of which is incorporated herein by reference, which discloses RNA molecules including multiple RNA sequences (such as gRNA sequences) separated by tRNA cleavage sequences. In other embodiments, a DNA molecule encodes multiple gRNAs which are separated by other types of cleavable transcript, for example, small RNA (e. g., miRNA, siRNA, or ta-siRNA) recognition sites which can be cleaved by the corresponding small RNA, or dsRNA-forming regions which can be cleaved by a Dicer-type ribonuclease, or sequences which are recognized by RNA nucleases such as Csy4 ribonuclease from Pseudomonas aeruginosa; see, e. g., U.S. Pat. No. 7,816,581, the entire specification of which is incorporated herein by reference, which discloses in FIG. 27 and elsewhere in the specification pol II promoter-driven DNA constructs encoding RNA transcripts that are released by cleavage. Efficient Cas9-mediated gene editing has been achieved using a chimeric “single guide RNA” (“sgRNA”), an engineered (synthetic) single RNA molecule that mimics a naturally occurring crRNA-tracrRNA complex and contains both a tracrRNA (for binding the nuclease) and at least one crRNA (to guide the nuclease to the sequence targeted for editing). In other embodiments, self-cleaving ribozyme sequences can be used to separate multiple gRNA sequences within a transcript.
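
The length range (16 to 24 nucleotides) and the notion of percent complementarity described above can be checked with a short Python sketch. This is an illustration under simplifying assumptions, not the gRNA design method of the cited publications: both sequences are written 5′ to 3′, the target is the DNA strand that the gRNA base-pairs with, only Watson-Crick pairs are counted, and the function names and example sequence are hypothetical.

```python
# RNA base that pairs with each DNA base of the target strand.
RNA_PARTNER = {"A": "U", "C": "G", "G": "C", "T": "A"}


def perfectly_matched_guide(target_dna: str) -> str:
    """Return the RNA (5'->3') that pairs perfectly with target_dna (5'->3')."""
    # Pairing is antiparallel, so the target is read in reverse.
    return "".join(RNA_PARTNER[base] for base in reversed(target_dna.upper()))


def has_supported_length(grna: str, min_len: int = 16, max_len: int = 24) -> bool:
    """Check the 16 to 24 nucleotide length range stated in the text."""
    return min_len <= len(grna) <= max_len


def percent_complementarity(grna: str, target_dna: str) -> float:
    """Percentage of gRNA positions that base-pair with the target strand."""
    grna = grna.upper()
    if len(grna) == 0 or len(grna) != len(target_dna):
        raise ValueError("gRNA and target must be non-empty and equal in length")
    ideal = perfectly_matched_guide(target_dna)
    matches = sum(1 for observed, expected in zip(grna, ideal) if observed == expected)
    return 100.0 * matches / len(grna)


# Hypothetical 20-nucleotide target, for illustration only.
target = "AAACCCGGGTTTAAACCCGG"
guide = perfectly_matched_guide(target)
one_mismatch = "A" + guide[1:]  # first base changed from C to A
```

Here `percent_complementarity(guide, target)` corresponds to the exact-complementarity embodiments, and `one_mismatch` to the embodiments with less than 100% complementarity.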

Thus, in certain embodiments wherein the nuclease is a Cas9-type nuclease, the gRNA can be provided as a polynucleotide composition including: (a) a CRISPR RNA (crRNA) that includes the gRNA together with a separate tracrRNA, or (b) at least one polynucleotide that encodes a crRNA and a tracrRNA (on a single polynucleotide or on separate polynucleotides), or (c) at least one polynucleotide that is processed into one or more crRNAs and a tracrRNA. In other embodiments wherein the nuclease is a Cas9-type nuclease, the gRNA can be provided as a polynucleotide composition including a CRISPR RNA (crRNA) that includes the gRNA, and the required tracrRNA is provided in a separate composition or in a separate step, or is otherwise provided to the cell (for example, to a plant cell or plant protoplast that stably or transiently expresses the tracrRNA from a polynucleotide encoding the tracrRNA). In other embodiments wherein the nuclease is a Cas9-type nuclease, the gRNA can be provided as a polynucleotide composition comprising: (a) a single guide RNA (sgRNA) that includes the gRNA, or (b) a polynucleotide that encodes a sgRNA, or (c) a polynucleotide that is processed into a sgRNA. Cpf1-mediated gene editing does not require a tracrRNA; thus, in embodiments wherein the nuclease is a Cpf1-type nuclease, the gRNA is provided as a polynucleotide composition comprising (a) a CRISPR RNA (crRNA) that includes the gRNA, or (b) a polynucleotide that encodes a crRNA, or (c) a polynucleotide that is processed into a crRNA. In embodiments, the gRNA-containing composition optionally includes an RNA-guided nuclease, or a polynucleotide that encodes the RNA-guided nuclease. In other embodiments, an RNA-guided nuclease or a polynucleotide that encodes the RNA-guided nuclease is provided in a separate step. In some embodiments of the method, a gRNA is provided to a cell (e. g., a plant cell or plant protoplast) that includes an RNA-guided nuclease or a polynucleotide that encodes an RNA-guided nuclease, e. g., an RNA-guided nuclease selected from the group consisting of an RNA-guided DNA endonuclease, a type II Cas nuclease, a Cas9, a type V Cas nuclease, a Cpf1, a CasY, a CasX, a C2c1, a C2c3, an engineered RNA-guided nuclease, and a codon-optimized RNA-guided nuclease; in an example, the cell (e. g., a plant cell or plant protoplast) stably or transiently expresses the RNA-guided nuclease. In embodiments, the polynucleotide that encodes the RNA-guided nuclease is, for example, DNA that encodes the RNA-guided nuclease and is stably integrated in the genome of a plant cell or plant protoplast, or DNA or RNA that encodes the RNA-guided nuclease and is transiently present in or introduced into a plant cell or plant protoplast; such DNA or RNA can be introduced, e. g., by using a vector such as a plasmid or viral vector or as an mRNA, or as vector-less DNA or RNA introduced directly into a plant cell or plant protoplast.

In embodiments that further include the step of providing to a cell (e. g., a plant cell or plant protoplast) an RNA-guided nuclease or a polynucleotide that encodes the RNA-guided nuclease, the RNA-guided nuclease is provided simultaneously with the gRNA-containing composition, or in a separate step that precedes or follows the step of providing the gRNA-containing composition. In embodiments, the gRNA-containing composition further includes an RNA-guided nuclease or a polynucleotide that encodes the RNA-guided nuclease. In other embodiments, there is provided a separate composition that includes an RNA-guided nuclease or a polynucleotide that encodes the RNA-guided nuclease. In embodiments, the RNA-guided nuclease is provided as a ribonucleoprotein (RNP) complex, e. g., a preassembled RNP that includes the RNA-guided nuclease complexed with a polynucleotide including the gRNA or encoding a gRNA, or a preassembled RNP that includes a polynucleotide that encodes the RNA-guided nuclease (and optionally encodes the gRNA, or is provided with a separate polynucleotide including the gRNA or encoding a gRNA), complexed with a protein. In embodiments, the RNA-guided nuclease is a fusion protein, i. e., wherein the RNA-guided nuclease (e. g., Cas9, Cpf1, CasY, CasX, C2c1, or C2c3) is covalently bound through a peptide bond to a cell-penetrating peptide, a nuclear localization signal peptide, a chloroplast transit peptide, or a mitochondrial targeting peptide; such fusion proteins are conveniently encoded in a single nucleotide sequence, optionally including codons for linking amino acids. In embodiments, the RNA-guided nuclease or a polynucleotide that encodes the RNA-guided nuclease is provided as a complex with a cell-penetrating peptide or other transfecting agent. In embodiments, the RNA-guided nuclease or a polynucleotide that encodes the RNA-guided nuclease is complexed with, or covalently or non-covalently bound to, a further element, e. g., a carrier molecule, an antibody, an antigen, a viral movement protein, a polymer, a detectable label (e. g., a moiety detectable by fluorescence, radioactivity, or enzymatic or immunochemical reaction), a quantum dot, or a particulate or nanoparticulate. In embodiments, the RNA-guided nuclease or a polynucleotide that encodes the RNA-guided nuclease is provided in a solution, or is provided in a liposome, micelle, emulsion, reverse emulsion, suspension, or other mixed-phase composition.

An RNA-guided nuclease can be provided to a cell (e. g., a plant cell or plant protoplast) by any suitable technique. In embodiments, the RNA-guided nuclease is provided by directly contacting a plant cell or plant protoplast with the RNA-guided nuclease or the polynucleotide that encodes the RNA-guided nuclease. In embodiments, the RNA-guided nuclease is provided by transporting the RNA-guided nuclease or a polynucleotide that encodes the RNA-guided nuclease into a plant cell or plant protoplast using a chemical, enzymatic, or physical agent as provided in detail in the paragraphs following the heading “Delivery Methods and Delivery Agents”. In embodiments, the RNA-guided nuclease is provided by bacterially mediated (e. g., Agrobacterium sp., Rhizobium sp., Sinorhizobium sp., Mesorhizobium sp., Bradyrhizobium sp., Azobacter sp., Phyllobacterium sp.) transfection of a plant cell or plant protoplast with a polynucleotide encoding the RNA-guided nuclease; see, e. g., Broothaerts et al. (2005) Nature, 433:629-633. In an embodiment, the RNA-guided nuclease is provided by transcription in a plant cell or plant protoplast of a DNA that encodes the RNA-guided nuclease and is stably integrated in the genome of the plant cell or plant protoplast or that is provided to the plant cell or plant protoplast in the form of a plasmid or expression vector (e. g., a viral vector) that encodes the RNA-guided nuclease (and optionally encodes one or more gRNAs, crRNAs, or sgRNAs, or is optionally provided with a separate plasmid or vector that encodes one or more gRNAs, crRNAs, or sgRNAs). In embodiments, the RNA-guided nuclease is provided to the plant cell or plant protoplast as a polynucleotide that encodes the RNA-guided nuclease, e. g., in the form of an mRNA encoding the nuclease.

Where a polynucleotide is concerned (e. g., a crRNA that includes the gRNA together with a separate tracrRNA, or a crRNA and a tracrRNA encoded on a single polynucleotide or on separate polynucleotides, or at least one polynucleotide that is processed into one or more crRNAs and a tracrRNA, or a sgRNA that includes the gRNA, or a polynucleotide that encodes a sgRNA, or a polynucleotide that is processed into a sgRNA, or a polynucleotide that encodes the RNA-guided nuclease), embodiments of the polynucleotide include: (a) double-stranded RNA; (b) single-stranded RNA; (c) chemically modified RNA; (d) double-stranded DNA; (e) single-stranded DNA; (f) chemically modified DNA; or (g) a combination of (a)-(f). Where expression of a polynucleotide is involved (e. g., expression of a crRNA from a DNA encoding the crRNA, or expression and translation of an RNA-guided nuclease from a DNA encoding the nuclease), in some embodiments it is sufficient that expression be transient, i. e., not necessarily permanent or stable in the cell. Certain embodiments of the polynucleotide further include additional nucleotide sequences that provide useful functionality; non-limiting examples of such additional nucleotide sequences include an aptamer or riboswitch sequence, nucleotide sequence that provides secondary structure such as stem-loops or that provides a sequence-specific site for an enzyme (e. g., a sequence-specific recombinase or endonuclease site), T-DNA (e. g., DNA sequence encoding a gRNA, crRNA, tracrRNA, or sgRNA is enclosed between left and right T-DNA borders from Agrobacterium spp. or from other bacteria that infect or induce tumours in plants), a DNA nuclear-targeting sequence, a regulatory sequence such as a promoter sequence, and a transcript-stabilizing sequence. Certain embodiments of the polynucleotide include those wherein the polynucleotide is complexed with, or covalently or non-covalently bound to, a non-nucleic acid element, e. g., a carrier molecule, an antibody, an antigen, a viral movement protein, a cell-penetrating or pore-forming peptide, a polymer, a detectable label, a quantum dot, or a particulate or nanoparticulate.

In embodiments, the at least one DSB is introduced into the genome by at least one treatment selected from the group consisting of: (a) bacterially mediated (e. g., Agrobacterium sp., Rhizobium sp., Sinorhizobium sp., Mesorhizobium sp., Bradyrhizobium sp., Azobacter sp., Phyllobacterium sp.) transfection with a DSB-inducing agent; (b) Biolistics or particle bombardment with a DSB-inducing agent; (c) treatment with at least one chemical, enzymatic, or physical agent as provided in detail in the paragraphs following the heading “Delivery Methods and Delivery Agents”; and (d) application of heat or cold, ultrasonication, centrifugation, positive or negative pressure, cell wall or membrane disruption or deformation, or electroporation. It is generally desirable that introduction of the at least one DSB into the genome (i. e., the “editing” of the genome) is achieved with sufficient efficiency and accuracy to ensure practical utility. One measure of efficiency is the percentage or fraction of the population of cells that have been treated with a DSB-inducing agent and in which the DSB is successfully introduced at the correct site in the genome. The efficiency of genome editing is assessed by any suitable method such as a heteroduplex cleavage assay or by sequencing, as described elsewhere in this disclosure. Accuracy is indicated by the absence of, or minimal occurrence of, off-target introduction of a DSB (i. e., at other than the intended site in the genome).

The location where the at least one DSB is introduced varies according to the desired result, for example whether the intention is to simply disrupt expression of the sequence of interest, or to add functionality (such as placing expression of the sequence of interest under inducible control). Thus, the location of the DSB is not necessarily within or directly adjacent to the sequence of interest. In embodiments, the at least one DSB in a genome is located: (a) within the sequence of interest, (b) upstream of (i. e., 5′ to) the sequence of interest, or (c) downstream of (i. e., 3′ to) the sequence of interest. In embodiments, a sequence encoded by the polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule, when integrated into the genome, is functionally or operably linked (e. g., linked in a manner that modifies the transcription or the translation of the sequence of interest or that modifies the stability of a transcript including that of the sequence of interest) to the sequence of interest. In embodiments, a sequence encoded by the polynucleotide donor molecule is integrated at a location 5′ to and operably linked to the sequence of interest, wherein the integration location is selected to provide a specifically modulated (upregulated or downregulated) level of expression of the sequence of interest. For example, a sequence encoded by the polynucleotide donor molecule is integrated at a specific location in the promoter region of a protein-encoding gene that results in a desired expression level of the protein; in an embodiment, the appropriate location is determined empirically by integrating a sequence encoded by the polynucleotide donor molecule at about 50, about 100, about 150, about 200, about 250, about 300, about 350, about 400, about 450, and about 500 nucleotides 5′ to (upstream of) the start codon of the coding sequence, and observing the relative expression levels of the protein for each integration location.
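
The empirical screen described above (integrating at about 50 to about 500 nucleotides upstream of the start codon in steps of 50, then comparing relative expression) can be sketched in Python as follows. This is a minimal illustration: the function names, the 0-based coordinate convention, the start codon position, and the expression values are all hypothetical, and real measurements would come from the expression assay of choice.

```python
def candidate_offsets(start: int = 50, stop: int = 500, step: int = 50) -> list[int]:
    """Distances (in nucleotides) upstream of the start codon to test."""
    return list(range(start, stop + 1, step))


def integration_coordinate(start_codon_index: int, offset: int) -> int:
    """0-based genomic coordinate located `offset` nucleotides 5' of the start codon."""
    coordinate = start_codon_index - offset
    if coordinate < 0:
        raise ValueError("offset extends past the beginning of the sequence")
    return coordinate


def best_offset(relative_expression: dict[int, float], desired_level: float) -> int:
    """Return the tested offset whose measured expression is closest to the desired level."""
    if not relative_expression:
        raise ValueError("no measurements supplied")
    return min(
        relative_expression,
        key=lambda offset: abs(relative_expression[offset] - desired_level),
    )


# Hypothetical measurements (relative to the unedited control), for illustration only.
measured = {50: 0.4, 100: 1.2, 150: 2.5}
chosen = best_offset(measured, desired_level=1.0)
```

Selecting the offset closest to a desired level, rather than the maximum, reflects that the text contemplates both upregulated and downregulated expression.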

In embodiments, the donor polynucleotide sequence of interest includes coding (protein-coding) sequence, non-coding (non-protein-coding) sequence, or a combination of coding and non-coding sequence. Embodiments include a plant nuclear sequence, a plant plastid sequence, a plant mitochondrial sequence, a sequence of a symbiont, pest, or pathogen of a plant, and combinations thereof. Embodiments include exons, introns, regulatory sequences including promoters, other 5′ elements and 3′ elements, and genomic loci encoding non-coding RNAs including long non-coding RNAs (lncRNAs), microRNAs (miRNAs), and trans-acting siRNAs (ta-siRNAs). In embodiments, multiple sequences are altered, for example, by delivery of multiple gRNAs to the plant cell or plant protoplast; the multiple sequences can be part of the same gene (e. g., different locations in a single coding region or in different exons or introns of a protein-coding gene) or different genes. In embodiments, the sequence of an endogenous genomic locus is altered to delete, add, or modify a functional non-coding sequence; in non-limiting examples, such functional non-coding sequences include, e. g., a miRNA, siRNA, or ta-siRNA recognition or cleavage site, a splice site, a recombinase recognition site, a transcription factor binding site, or a transcriptional or translational enhancer or repressor sequence.

In embodiments, the invention provides a method of changing expression of a sequence of interest in a genome, including integrating a sequence encoded by a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule at the site of two or more DSBs in a genome. In embodiments, the sequence of the polynucleotide donor molecule that is integrated into each of the two or more DSBs is (a) identical, or (b) different, for each of the DSBs. In embodiments, the change in expression of a sequence of interest in a genome is manifested as the expression of an altered or edited sequence of interest; in non-limiting examples, the method is used to integrate sequence-specific recombinase recognition site sequences at two DSBs in a genome, whereby, in the presence of the corresponding site-specific DNA recombinase, the genomic sequence flanked on either side by the integrated recombinase recognition sites is excised from the genome (or in some instances is inverted); such an approach is useful, e.g., for deletion of larger lengths of genomic sequence, for example, deletion of all or part of an exon or of one or more protein domains. In other embodiments, at least two DSBs are introduced into a genome by one or more nucleases in such a way that genomic sequence is deleted between the DSBs (leaving a deletion with blunt ends, overhangs or a combination of a blunt end and an overhang), and a sequence encoded by at least one polynucleotide donor molecule is integrated between the DSBs (i.e., a sequence encoded by at least one individual polynucleotide donor molecule is integrated at the location of the deleted genomic sequence), wherein the genomic sequence that is deleted is coding sequence, non-coding sequence, or a combination of coding and non-coding sequence; such embodiments provide the advantage of not requiring a specific PAM site at or very near the location of a region wherein a nucleotide sequence change is desired. In an embodiment, at least two DSBs are introduced into a genome by one or more nucleases in such a way that genomic sequence is deleted between the DSBs (leaving a deletion with blunt ends, overhangs or a combination of a blunt end and an overhang), and at least one sequence encoded by a polynucleotide donor molecule is integrated between the DSBs (i.e., at least one individual sequence encoded by a polynucleotide donor molecule is integrated at the location of the deleted genomic sequence). In an embodiment, two DSBs are introduced into a genome, resulting in excision or deletion of genomic sequence between the sites of the two DSBs, and a sequence encoded by a polynucleotide donor molecule is integrated into the genome at the location of the deleted genomic sequence (that is, a sequence encoded by an individual polynucleotide donor molecule is integrated between the two DSBs). Generally, the polynucleotide donor molecule with the sequence to be integrated into the genome is selected in terms of the presence or absence of terminal overhangs to match the type of DSBs introduced. In an embodiment, two blunt-ended DSBs are introduced into a genome, resulting in excision or deletion of genomic sequence between the sites of the two blunt-ended DSBs, and a sequence encoded by a blunt-ended double-stranded DNA or blunt-ended double-stranded DNA/RNA hybrid or a single-stranded DNA or a single-stranded DNA/RNA hybrid donor molecule is integrated into the genome between the two blunt-ended DSBs.
In another embodiment, two DSBs are introduced into a genome, wherein the first DSB is blunt-ended and the second DSB has an overhang, resulting in deletion of genomic sequence between the two DSBs, and a sequence encoded by a double-stranded DNA or double-stranded DNA/RNA hybrid donor molecule that is blunt-ended at one terminus and that has an overhang on the other terminus (or, alternatively, a single-stranded DNA or a single-stranded DNA/RNA hybrid molecule) is integrated into the genome between the two DSBs; in an alternative embodiment, two DSBs are introduced into a genome, wherein both DSBs have overhangs but of different overhang lengths (different number of unpaired nucleotides), resulting in deletion of genomic sequence between the two DSBs, and a sequence encoded by a double-stranded DNA or double-stranded DNA/RNA hybrid donor molecule that has overhangs at each terminus, wherein the overhangs are of unequal lengths (or, alternatively, a single-stranded DNA or a single-stranded DNA/RNA hybrid donor molecule), is integrated into the genome between the two DSBs; embodiments with such DSB asymmetry (i.e., a combination of DSBs having a blunt end and an overhang, or a combination of DSBs having overhangs of unequal lengths) provide the opportunity for controlling directionality or orientation of the inserted polynucleotide, e.g., by selecting a double-stranded DNA or double-stranded DNA/RNA hybrid donor molecule having one blunt end and one terminus with unpaired nucleotides, such that the polynucleotide is integrated preferentially in one orientation.
In another embodiment, two DSBs, each having an overhang, are introduced into a genome, resulting in excision or deletion of genomic sequence between the sites of the two DSBs, and a sequence encoded by a double-stranded DNA or double-stranded DNA/RNA hybrid donor molecule that has an overhang at each terminus (or, alternatively, a single-stranded DNA or a single-stranded DNA/RNA hybrid donor molecule) is integrated into the genome between the two DSBs. The length of genomic sequence that is deleted between two DSBs and the length of a sequence encoded by the polynucleotide donor molecule that is integrated in place of the deleted genomic sequence can be, but need not be, equal. In embodiments, the distance between any two DSBs (or the length of the genomic sequence that is to be deleted) is at least 10, at least 15, at least 20, at least 25, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, or at least 100 nucleotides; in other embodiments the distance between any two DSBs (or the length of the genomic sequence that is to be deleted) is at least 100, at least 150, at least 200, at least 300, at least 400, at least 500, at least 600, at least 750, or at least 1000 nucleotides. In embodiments where more than two DSBs are introduced into genomic sequence, it is possible to effect different deletions of genomic sequence (for example, where three DSBs are introduced, genomic sequence can be deleted between the first and second DSBs, between the first and third DSBs, and between the second and third DSBs). In some embodiments, a sequence encoded by more than one polynucleotide donor molecule (e.g., multiple copies of a sequence encoded by a polynucleotide donor molecule having a given sequence, or multiple sequences encoded by polynucleotide donor molecules with two or more different sequences) is integrated into the genome.
For example, different sequences encoded by individual polynucleotide donor molecules can be individually integrated at a single locus where genomic sequence has been deleted between two DSBs, or at multiple locations where genomic sequence has been deleted (e.g., where more than two DSBs have been introduced into the genome). In embodiments, at least one exon is replaced by integrating a sequence encoded by at least one polynucleotide molecule where genomic sequence is deleted between DSBs that were introduced by at least one sequence-specific nuclease into intronic sequence flanking the at least one exon; an advantage of this approach over an otherwise similar method (i.e., differing by having the DSBs introduced into coding sequence instead of intronic sequence) is the avoidance of inaccuracies (nucleotide changes, deletions, or additions at the nuclease cleavage sites) in the resulting exon sequence or messenger RNA.
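The statement above that three DSBs permit deletions between the first and second, first and third, and second and third breaks generalizes to any number of DSBs as the set of pairwise combinations. The following Python sketch enumerates them; the function name, the coordinate convention, and the example positions are illustrative assumptions rather than anything defined in the specification.

```python
from itertools import combinations


def possible_deletions(dsb_positions, min_length=0):
    """Enumerate every deletion that can arise between pairs of DSBs.

    dsb_positions: cut-site coordinates on one chromosome (assumed to be
    integers on a common coordinate system).
    min_length: optional lower bound on the deleted length, mirroring the
    "at least N nucleotides" embodiments in the text.
    Returns a list of (left_cut, right_cut, deleted_length) tuples."""
    sites = sorted(set(dsb_positions))
    return [(left, right, right - left)
            for left, right in combinations(sites, 2)
            if right - left >= min_length]


if __name__ == "__main__":
    # Three hypothetical DSBs give three candidate deletions.
    for deletion in possible_deletions([100, 250, 400]):
        print(deletion)
```

With n DSBs the number of candidate deletions is n(n-1)/2, which is why genotyping of edited material generally needs to distinguish among several deletion outcomes.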

In embodiments, the methods described herein are used to delete or replace genomic sequence, which can be a relatively large sequence (e.g., all or part of at least one exon or of a protein domain) resulting in the equivalent of an alternatively spliced transcript. Additional related aspects include compositions and reaction mixtures including a plant cell or a plant protoplast and at least two guide RNAs, wherein each guide RNA is designed to effect a DSB in intronic sequence flanking at least one exon; such compositions and reaction mixtures optionally include at least one sequence-specific nuclease capable of being guided by at least one of the guide RNAs to effect a DSB in genomic sequence, and optionally include a polynucleotide donor molecule that is capable of being integrated (or having its sequence integrated) into the genome at the location of at least one DSB or at the location of genomic sequence that is deleted between the DSBs.

Donor Polynucleotide Molecules: Embodiments of the polynucleotide donor molecule having a sequence that is integrated at the site of at least one double-strand break (DSB) in a genome include a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, and a double-stranded DNA/RNA hybrid. In embodiments, a polynucleotide donor molecule that is a double-stranded (e.g., a dsDNA or dsDNA/RNA hybrid) molecule is provided directly to the plant protoplast or plant cell in the form of a double-stranded DNA or a double-stranded DNA/RNA hybrid, or as two single-stranded DNA (ssDNA) molecules that are capable of hybridizing to form dsDNA, or as a single-stranded DNA molecule and a single-stranded RNA (ssRNA) molecule that are capable of hybridizing to form a double-stranded DNA/RNA hybrid; that is to say, the double-stranded polynucleotide molecule is not provided indirectly, for example, by expression in the cell of a dsDNA encoded by a plasmid or other vector. In various non-limiting embodiments of the method, the polynucleotide donor molecule that is integrated (or that has a sequence that is integrated) at the site of at least one double-strand break (DSB) in a genome is double-stranded and blunt-ended; in other embodiments the polynucleotide donor molecule is double-stranded and has an overhang or “sticky end” consisting of unpaired nucleotides (e.g., 1, 2, 3, 4, 5, or 6 unpaired nucleotides) at one terminus or both termini. In an embodiment, the DSB in the genome has no unpaired nucleotides at the cleavage site, and the polynucleotide donor molecule that is integrated (or that has a sequence that is integrated) at the site of the DSB is a blunt-ended double-stranded DNA or blunt-ended double-stranded DNA/RNA hybrid molecule, or alternatively is a single-stranded DNA or a single-stranded DNA/RNA hybrid molecule.
In another embodiment, the DSB in the genome has one or more unpaired nucleotides at one or both sides of the cleavage site, and the polynucleotide donor molecule that is integrated (or that has a sequence that is integrated) at the site of the DSB is a double-stranded DNA or double-stranded DNA/RNA hybrid molecule with an overhang or “sticky end” consisting of unpaired nucleotides at one or both termini, or alternatively is a single-stranded DNA or a single-stranded DNA/RNA hybrid molecule; in embodiments, the polynucleotide donor molecule is a double-stranded DNA or double-stranded DNA/RNA hybrid molecule that includes an overhang at one or at both termini, wherein the overhang consists of the same number of unpaired nucleotides as the number of unpaired nucleotides created at the site of a DSB by a nuclease that cuts in an off-set fashion (e.g., where a Cpf1 nuclease effects an off-set DSB with 5-nucleotide overhangs in the genomic sequence, the polynucleotide donor molecule that is to be integrated (or that has a sequence that is to be integrated) at the site of the DSB is double-stranded and has 5 unpaired nucleotides at one or both termini). Generally, one or both termini of the polynucleotide donor molecule contain no regions of sequence homology (identity or complementarity) to genomic regions flanking the DSB; that is to say, one or both termini of the polynucleotide donor molecule contain no regions of sequence that is sufficiently complementary to permit hybridization to genomic regions immediately adjacent to the location of the DSB. In embodiments, the polynucleotide donor molecule contains no homology to the locus of the DSB, that is to say, the polynucleotide donor molecule contains no nucleotide sequence that is sufficiently complementary to permit hybridization to genomic regions immediately adjacent to the location of the DSB.
In an embodiment, the polynucleotide donor molecule that is integrated at the site of at least one double-strand break (DSB) includes 2-20 nucleotides in one (if single-stranded) or in both strands (if double-stranded), e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 nucleotides on one or on both strands, each of which can be base-paired to a nucleotide on the opposite strand (in the case of a perfectly base-paired double-stranded polynucleotide molecule). In embodiments, the polynucleotide donor molecule is at least partially double-stranded and includes 2-20 base-pairs, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 base-pairs; in embodiments, the polynucleotide donor molecule is double-stranded and blunt-ended and consists of 2-20 base-pairs, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 base-pairs; in other embodiments, the polynucleotide donor molecule is double-stranded and includes 2-20 base-pairs, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 base-pairs and in addition has at least one overhang or “sticky end” consisting of at least one additional, unpaired nucleotide at one or at both termini. Non-limiting examples of such relatively small polynucleotide donor molecules of 20 or fewer base-pairs (if double-stranded) or 20 or fewer nucleotides (if single-stranded) include polynucleotide donor molecules that have at least one strand including a transcription factor recognition site sequence (e.g., such as the sequences of transcription factor recognition sites provided in the working Examples), or that have at least one strand including a small RNA recognition site, or that have at least one strand including a recombinase recognition site.
In an embodiment, the polynucleotide donor molecule that is integrated (or that has a sequence that is integrated) at the site of at least one double-strand break (DSB) in a genome is a blunt-ended double-stranded DNA or a blunt-ended double-stranded DNA/RNA hybrid molecule of about 18 to about 300 base-pairs, or about 20 to about 200 base-pairs, or about 30 to about 100 base-pairs, and having at least one phosphorothioate bond between adjacent nucleotides at a 5′ end, 3′ end, or both 5′ and 3′ ends. In embodiments, the polynucleotide donor molecule includes single strands of at least 11, at least 18, at least 20, at least 30, at least 40, at least 60, at least 80, at least 100, at least 120, at least 140, at least 160, at least 180, at least 200, at least 240, at least 280, or at least 320 nucleotides. In embodiments, the polynucleotide donor molecule has a length of at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, or at least 11 base-pairs if double-stranded (or nucleotides if single-stranded), or between about 2 to about 320 base-pairs if double-stranded (or nucleotides if single-stranded), or between about 2 to about 500 base-pairs if double-stranded (or nucleotides if single-stranded), or between about 5 to about 500 base-pairs if double-stranded (or nucleotides if single-stranded), or between about 5 to about 300 base-pairs if double-stranded (or nucleotides if single-stranded), or between about 11 to about 300 base-pairs if double-stranded (or nucleotides if single-stranded), or about 18 to about 300 base-pairs if double-stranded (or nucleotides if single-stranded), or between about 30 to about 100 base-pairs if double-stranded (or nucleotides if single-stranded). In embodiments, the polynucleotide donor molecule includes chemically modified nucleotides (see, e.g., the various modifications of internucleotide linkages, bases, and sugars described in Verma and Eckstein (1998) Annu. Rev. Biochem., 67:99-134); in embodiments, the naturally occurring phosphodiester backbone of the polynucleotide donor molecule is partially or completely modified with phosphorothioate, phosphorodithioate, or methylphosphonate internucleotide linkage modifications, or the polynucleotide donor molecule includes modified nucleoside bases or modified sugars, or the polynucleotide donor molecule is labelled with a fluorescent moiety (e.g., fluorescein or rhodamine or a fluorescent nucleoside analogue) or other detectable label (e.g., biotin or an isotope). In an embodiment, the polynucleotide donor molecule is double-stranded and perfectly base-paired through all or most of its length, with the possible exception of any unpaired nucleotides at either terminus or both termini. In another embodiment, the polynucleotide donor molecule is double-stranded and includes one or more non-terminal mismatches or non-terminal unpaired nucleotides within the otherwise double-stranded duplex. In another embodiment, the polynucleotide donor molecule contains secondary structure that provides stability or acts as an aptamer. Other related embodiments include double-stranded DNA/RNA hybrid molecules, single-stranded DNA/RNA hybrid donor molecules, and single-stranded DNA donor molecules (including single-stranded, chemically modified DNA donor molecules), which in analogous procedures are integrated (or have a sequence that is integrated) at the site of a double-strand break.
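The end-matching rule described in this section (a blunt donor for a blunt DSB, a donor with the same number of unpaired nucleotides for an off-set DSB, or a single-stranded donor in either case) can be expressed as a small check. The Python sketch below is a deliberately strict simplification: it assumes both termini must match, whereas the text also contemplates a match at only one terminus. The function names and the tuple representation of overhang lengths are hypothetical.

```python
def ends_compatible(dsb_overhangs, donor_overhangs, donor_single_stranded=False):
    """Check whether a donor's termini match the genomic ends.

    Each argument is a (left, right) pair giving the number of unpaired
    nucleotides at each end; 0 means blunt.
    Assumption: single-stranded donors are treated as always acceptable,
    following the "or alternatively" wording of the text."""
    if donor_single_stranded:
        return True
    return tuple(dsb_overhangs) == tuple(donor_overhangs)


def orientation_constrained(overhangs):
    """Asymmetric ends (blunt plus overhang, or unequal overhang lengths)
    are what the text describes as favoring one insertion orientation."""
    left, right = overhangs
    return left != right


if __name__ == "__main__":
    # Off-set cuts leaving 5-nucleotide overhangs, as in the Cpf1 example.
    print(ends_compatible((5, 5), (5, 5)))
    # A blunt double-stranded donor does not match those ends.
    print(ends_compatible((5, 5), (0, 0)))
    # One blunt end and one overhang: orientation can be biased.
    print(orientation_constrained((0, 5)))
```

The check considers overhang length only; it does not model overhang sequence or strand polarity.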

In embodiments of the method, the polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule that is integrated at the site of at least one double-strand break (DSB) in a genome includes nucleotide sequence(s) on one or on both strands that provide a desired functionality when the polynucleotide is integrated into the genome. In various non-limiting embodiments of the method, the sequence encoded by a donor polynucleotide that is inserted at the site of at least one double-strand break (DSB) in a genome includes at least one sequence selected from the group consisting of:

-   (a) DNA encoding at least one stop codon, or at least one stop codon on each strand, or at least one stop codon within each reading frame on each strand;
-   (b) DNA encoding heterologous primer sequence (e.g., a sequence of about 18 to about 22 contiguous nucleotides, or of at least 18 contiguous nucleotides, that can be used to initiate DNA polymerase activity at the site of the DSB);
-   (c) DNA encoding a unique identifier sequence (e.g., a sequence that when inserted at the DSB creates a heterologous sequence that can be used to identify the presence of the insertion);
-   (d) DNA encoding a transcript-stabilizing sequence;
-   (e) DNA encoding a transcript-destabilizing sequence;
-   (f) a DNA aptamer or DNA encoding an RNA aptamer or amino acid aptamer; and
-   (g) DNA that includes or encodes a sequence recognizable by a specific binding agent.

In an embodiment, the sequence encoded by the polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule that is integrated at the site of at least one double-strand break (DSB) in a genome includes DNA encoding at least one stop codon, or at least one stop codon on each strand, or at least one stop codon within each reading frame on each strand. Such a sequence encoded by a polynucleotide donor molecule, when integrated at a DSB in a genome, can be useful for disrupting the expression of a sequence of interest, such as a protein-coding gene. An example of such a polynucleotide donor molecule is a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid donor molecule, of at least 18 contiguous base-pairs if double-stranded or at least 11 contiguous nucleotides if single-stranded, and encoding at least one stop codon in each possible reading frame on either strand. Another example of such a polynucleotide donor molecule is a double-stranded DNA or double-stranded DNA/RNA hybrid donor molecule wherein each strand includes at least 18 and fewer than 200 contiguous base-pairs, wherein the number of base-pairs is not divisible by 3, and wherein each strand encodes at least one stop codon in each possible reading frame in the 5′ to 3′ direction. Another example of such a polynucleotide donor molecule is a single-stranded DNA or single-stranded DNA/RNA hybrid donor molecule wherein the strand includes at least 11 and fewer than about 300 contiguous nucleotides, wherein the number of nucleotides is not divisible by 3, and wherein the polynucleotide donor molecule encodes at least one stop codon in each possible reading frame in the 5′ to 3′ direction.
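Whether a candidate donor sequence carries a stop codon in each reading frame on each strand can be verified computationally. The Python sketch below checks the three frames of a DNA sequence and the three frames of its reverse complement against the standard stop codons TAA, TAG, and TGA. The function names and the 13-nucleotide test sequence are illustrative assumptions and are not sequences from the specification.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def frames_with_stop(strand):
    """Return the set of reading frames (0, 1, 2) of one strand, read
    5' to 3', that contain at least one in-frame stop codon."""
    strand = strand.upper()
    frames = set()
    for frame in range(3):
        for i in range(frame, len(strand) - 2, 3):
            if strand[i:i + 3] in STOP_CODONS:
                frames.add(frame)
                break
    return frames


def has_stop_in_all_six_frames(seq):
    """True if every frame on both strands contains a stop codon."""
    all_frames = {0, 1, 2}
    return (frames_with_stop(seq) == all_frames
            and frames_with_stop(reverse_complement(seq)) == all_frames)


if __name__ == "__main__":
    # Hypothetical 13-nt sequence (length not divisible by 3).
    print(has_stop_in_all_six_frames("TTAATTAATTAAT"))
    # Stops in all three forward frames but none on the reverse strand.
    print(has_stop_in_all_six_frames("TAAATAAATAA"))
```

The check treats the donor in isolation; after integration, the frame of the surrounding coding sequence relative to the insert still depends on the exact junctions formed at the DSB.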

In an embodiment, the polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule that is integrated (or that has a sequence that is integrated) at the site of at least one double-strand break (DSB) in a genome includes DNA encoding heterologous primer sequence (e.g., a sequence of about 18 to about 22 contiguous nucleotides, or of at least 18, at least 20, or at least 22 contiguous nucleotides that can be used to initiate DNA polymerase activity at the site of the DSB). Heterologous primer sequence can further include nucleotides of the genomic sequence directly flanking the site of the DSB.

In an embodiment, the polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule that is integrated (or that has a sequence that is integrated) at the site of at least one double-strand break (DSB) in a genome includes nucleotides encoding a unique identifier sequence (e.g., a sequence that when inserted at the DSB creates a heterologous sequence that can be used to identify the presence of the insertion).

In an embodiment, the polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule that is integrated (or that has a sequence that is integrated) at the site of at least one double-strand break (DSB) in a genome includes nucleotides encoding a transcript-stabilizing sequence. In an example, sequence of a double-stranded or single-stranded DNA or a DNA/RNA hybrid donor molecule encoding a 5′ terminal RNA-stabilizing stem-loop (see, e.g., Suay (2005) Nucleic Acids Res., 33:4754-4761) is integrated at a DSB located 5′ to the sequence for which improved transcript stability is desired. In another embodiment, the polynucleotide donor molecule that is integrated (or that has a sequence that is integrated) at the site of at least one double-strand break (DSB) in a genome includes nucleotides encoding a transcript-destabilizing sequence such as the SAUR destabilizing sequences described in detail in U.S. Patent Application Publication 2007/0011761, incorporated herein by reference.

In an embodiment, the polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule that is integrated (or that has a sequence that is integrated) at the site of at least one double-strand break (DSB) in a genome includes a DNA aptamer or DNA encoding an RNA aptamer or amino acid aptamer. Nucleic acid (DNA or RNA) aptamers are single- or double-stranded nucleotides that bind specifically to molecules or ligands which include small molecules (e.g., secondary metabolites such as alkaloids, terpenes, flavonoids, and other small molecules, as well as larger molecules such as polyketides and non-ribosomal proteins), proteins, other nucleic acid molecules, and inorganic compounds. Introducing an aptamer at a specific location in the genome is useful, e.g., for adding binding specificity to an enzyme or for placing expression of a transcript or activity of an encoded protein under ligand-specific control. In an example, the polynucleotide donor molecule encodes a poly-histidine “tag” which is integrated at a DSB downstream of a protein or protein subunit, enabling the protein expressed from the resulting transcript to be purified by affinity to nickel, e.g., on nickel resins; in embodiments, the polynucleotide donor molecule encodes a 6×-His tag, a 10×-His tag, or a 10×-His tag including one or more stop codons following the histidine-encoding codons, where the last is particularly useful when integrated downstream of a protein or protein subunit lacking a stop codon (see, e.g., parts[dot]igem[dot]org/Part:BBa_K844000). In embodiments, the polynucleotide donor molecule encodes a riboswitch, wherein the riboswitch includes both an aptamer which changes its conformation in the presence or absence of a specific ligand, and an expression-controlling region that turns expression on or off, depending on the conformation of the aptamer.
See, for example, the regulatory RNA molecules containing ligand-specific aptamers described in U.S. Patent Application Publication 2013/0102651 and the various riboswitches described in U.S. Patent Application Publication 2005/0053951, both of which publications are incorporated herein by reference.

In an embodiment, the polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule that is integrated (or that has a sequence that is integrated) at the site of at least one double-strand break (DSB) in a genome includes nucleotides that include or encode a sequence recognizable by (i.e., binds to) a specific binding agent. Non-limiting embodiments of specific binding agents include nucleic acids, peptides or proteins, non-peptide/non-nucleic acid ligands, inorganic molecules, and combinations thereof; specific binding agents also include macromolecular assemblages such as lipid bilayers, cell components or organelles, and even intact cells or organisms. In embodiments, the specific binding agent is an aptamer or riboswitch, or alternatively is recognized by an aptamer or a riboswitch. In an embodiment, the invention provides a method of changing expression of a sequence of interest in a genome, comprising integrating a polynucleotide molecule at the site of a DSB in a genome, wherein the polynucleotide donor molecule includes a sequence recognizable by a specific binding agent, wherein the integrated sequence encoded by the polynucleotide donor molecule is functionally or operably linked to a sequence of interest, and wherein contacting the integrated sequence encoded by the polynucleotide donor molecule with the specific binding agent results in a change of expression of the sequence of interest; in embodiments, sequences encoded by different polynucleotide donor molecules are integrated at multiple DSBs in a genome.

In an embodiment, the polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule that is integrated (or that has a sequence that is integrated) at the site of at least one double-strand break (DSB) in a genome includes nucleotides that include or encode a sequence recognizable by (i.e., binds to) a specific binding agent, wherein:

-   -   (a) the sequence recognizable by a specific binding agent        includes an auxin response element (AuxRE) sequence, the        specific binding agent is an auxin, and the change of expression        is upregulation; see, e. g., Walker and Estelle (1998) Curr.        Opinion Plant Biol., 1:434-439;    -   (b) the sequence recognizable by a specific binding agent        includes at least one D1-4 sequence (CCTCGTGTCTC, SEQ ID NO:328;        see Ulmasov et al. (1997) Plant Cell, 9:1963-1971), the specific        binding agent is an auxin, and the change of expression is        upregulation;    -   (c) the sequence recognizable by a specific binding agent        includes at least one DR5 sequence (CCTTTTGTCTC, SEQ ID NO:329;        see Ulmasov et al. (1997) Plant Cell, 9:1963-1971), the specific        binding agent is an auxin, and the change of expression is        upregulation;    -   (d) the sequence recognizable by a specific binding agent        includes at least one m5-DR5 sequence (CCTTTTGTCNC, wherein N is        A, C, or G, SEQ ID NO:330; see Ulmasov et al. (1997) Plant Cell,        9:1963-1971), the specific binding agent is an auxin, and the        change of expression is upregulation;    -   (e) the sequence recognizable by a specific binding agent        includes at least one P3 sequence (TGTCTC, SEQ ID NO:331), the        specific binding agent is an auxin, and the change of expression        is upregulation;    -   (f) the sequence recognizable by a specific binding agent        includes a small RNA recognition site sequence, the specific        binding agent is the corresponding small RNA (e. g., an siRNA, a        microRNA (miRNA), a trans-acting siRNA as described in U.S. Pat.        No. 8,030,473, or a phased sRNA as described in U.S. Pat. No.        
8,404,928; both of these cited patents are incorporated by reference herein), and the change of expression is downregulation (non-limiting examples are given below, under the heading “Small RNAs”);
    -   (g) the sequence recognizable by a specific binding agent includes a microRNA (miRNA) recognition site sequence, the specific binding agent is the corresponding mature miRNA, and the change of expression is downregulation (non-limiting examples are given below, under the heading “Small RNAs”);
    -   (h) the sequence recognizable by a specific binding agent includes a microRNA (miRNA) recognition sequence for an engineered miRNA, the specific binding agent is the corresponding engineered mature miRNA, and the change of expression is downregulation;
    -   (i) the sequence recognizable by a specific binding agent includes a transposon recognition sequence, the specific binding agent is the corresponding transposon, and the change of expression is upregulation or downregulation;
    -   (j) the sequence recognizable by a specific binding agent includes an ethylene-responsive element binding-factor-associated amphiphilic repression (EAR) motif (LxLxL, SEQ ID NO:332 or DLNxxP, SEQ ID NO:333) sequence (see, e.g., Kagale and Rozwadowski (2011) Epigenetics, 6:141-146), the specific binding agent is ERF (ethylene-responsive element binding factor) or co-repressor (e.g., TOPLESS (TPL)), and the change of expression is downregulation;
    -   (k) the sequence recognizable by a specific binding agent includes a splice site sequence (e.g., a donor site, a branching site, or an acceptor site; see, for example, the splice sites and splicing signals publicly available at the ERIS database, lemur[dot]amu[dot]edu[dot]pl/share/ERISdb/home.html), the specific binding agent is a spliceosome, and the change of expression is expression of an alternatively spliced transcript (in some cases, this can include deletion of a relatively large genomic sequence, such as deletion of all or part of an exon or of a protein domain);
    -   (l) the sequence recognizable by a specific binding agent includes a recombinase recognition site sequence that is recognized by a site-specific recombinase, the specific binding agent is the corresponding site-specific recombinase, and the change of expression is upregulation or downregulation or expression of a transcript having an altered sequence (for example, expression of a transcript that has had a region of DNA excised by the recombinase) (non-limiting examples are given below, under the heading “Recombinases and Recombinase Recognition Sites”);
    -   (m) the sequence recognizable by a specific binding agent includes sequence encoding an RNA or amino acid aptamer or an RNA riboswitch, the specific binding agent is the corresponding ligand, and the change in expression is upregulation or downregulation;
    -   (n) the sequence recognizable by a specific binding agent is a hormone responsive element (e.g., a nuclear receptor, or a hormone-binding domain thereof), the specific binding agent is a hormone, and the change in expression is upregulation or downregulation; or
    -   (o) the sequence recognizable by a specific binding agent is a transcription factor binding sequence, the specific binding agent is the corresponding transcription factor, and the change in expression is upregulation or downregulation (non-limiting examples are given below, under the heading “Transcription Factors”).
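As a non-limiting illustration of item (j), the two EAR motif consensus patterns named there (LxLxL and DLNxxP, where x is any residue) can be located in an amino acid sequence with a simple pattern search. The following is a minimal sketch only; the function name and the example protein sequence are hypothetical placeholders and are not taken from the sequence listing.

```python
import re

# EAR motif consensus patterns from item (j); "." stands for any residue (x).
# Lookahead groups are used so that overlapping occurrences are all reported.
EAR_PATTERNS = {
    "LxLxL": re.compile(r"(?=(L.L.L))"),
    "DLNxxP": re.compile(r"(?=(DLN..P))"),
}


def find_ear_motifs(protein):
    """Return (motif_name, 0-based start, matched residues) for every hit."""
    hits = []
    for name, pattern in EAR_PATTERNS.items():
        for match in pattern.finditer(protein.upper()):
            hits.append((name, match.start(), match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical amino acid sequence, used for illustration only.
example_protein = "MSTLDLDLNKRPAADLNFTPQ"
for hit in find_ear_motifs(example_protein):
    print(hit)
```

A pattern match of this kind indicates only that the consensus is present; whether a given occurrence functions as a repression motif is not established by the search.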

In embodiments, the polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule that is integrated (or that has a sequence that is integrated) at the site of at least one double-strand break (DSB) in a genome includes a nucleotide sequence that encodes an RNA molecule or an amino acid sequence that is recognizable by a specific binding agent. In embodiments, the polynucleotide donor molecule includes a nucleotide sequence that binds specifically to a ligand or that encodes an RNA molecule or an amino acid sequence that binds specifically to a ligand. In embodiments, the polynucleotide donor molecule encodes at least one stop codon on each strand, or encodes at least one stop codon within each reading frame on each strand.
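By way of illustration, the condition that a donor molecule encodes at least one stop codon within each reading frame on each strand can be checked computationally. The sketch below assumes the standard stop codons (TAA, TAG, TGA) and a DNA donor written 5′ to 3′; the function names and the example donor sequence are hypothetical and are not sequences of the disclosure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def frames_with_stop(seq):
    """Map (strand, frame) to True if that reading frame contains a stop codon."""
    result = {}
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            codons = (s[i:i + 3] for i in range(frame, len(s) - 2, 3))
            result[(strand, frame)] = any(c in STOP_CODONS for c in codons)
    return result


def has_stop_in_all_six_frames(seq):
    """True if every reading frame on both strands contains a stop codon."""
    return all(frames_with_stop(seq).values())


# Hypothetical 22-nt donor that is its own reverse complement and carries
# TAA in each of the three forward frames (and therefore in all six frames).
example_donor = "TAAATAAATAATTATTTATTTA"
print(has_stop_in_all_six_frames(example_donor))
```

A donor that fails this check in any one frame could still permit read-through in that frame, which is why the per-frame map is returned separately from the overall result.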

In embodiments, the polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule includes an at least partially self-complementary sequence, such that the polynucleotide donor molecule encodes a transcript that is capable of forming at least partially double-stranded RNA. In embodiments, the at least partially double-stranded RNA is capable of forming secondary structure containing at least one stem-loop (i.e., a substantially or perfectly double-stranded RNA “stem” region and a single-stranded RNA “loop” connecting opposite strands of the dsRNA stem). In embodiments, the at least partially double-stranded RNA is cleavable by a Dicer or other ribonuclease. In embodiments, the at least partially double-stranded RNA includes an aptamer or a riboswitch; see, e.g., the RNA aptamers described in U.S. Patent Application Publication 2013/0102651, which is incorporated herein by reference.

In embodiments, the polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule that is integrated (or that has a sequence that is integrated) at the site of at least one double-strand break (DSB) in a genome includes or encodes a nucleotide sequence that is responsive to a specific change in the physical environment (e.g., a change in light intensity or quality, a change in temperature, a change in pressure, a change in osmotic concentration, a change in day length, or addition or removal of a ligand or specific binding agent), wherein exposing the integrated polynucleotide sequence to the specific change in the physical environment results in a change of expression of the sequence of interest. In embodiments, the polynucleotide donor molecule includes a nucleotide sequence encoding an RNA molecule or an amino acid sequence that is responsive to a specific change in the physical environment. In a non-limiting example, the polynucleotide donor molecule encodes an amino acid sequence that is responsive to light, oxygen, redox status, or voltage, such as a Light-Oxygen-Voltage (LOV) domain (see, e.g., Peter et al. (2010) Nature Communications, doi:10.1038/ncomms1121) or a PAS domain (see, e.g., Taylor and Zhulin (1999) Microbiol. Mol. Biol. Reviews, 63:479-506), proteins containing such domains, or sub-domains or motifs thereof (see, e.g., the photochemically active 36-residue N-terminal truncation of the VVD protein described by Zoltowski et al. (2007) Science, 316:1054-1057). In a non-limiting embodiment, integration of a LOV domain at the site of a DSB within or adjacent to a protein-coding region is used to create a heterologous fusion protein that can be photo-activated.

Small RNAs: In an embodiment, the polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule that is integrated (or that has a sequence that is integrated) at the site of at least one double-strand break (DSB) in a genome includes DNA that includes or encodes a small RNA recognition site sequence that is recognized by a corresponding mature small RNA. Small RNAs include siRNAs, microRNAs (miRNAs), trans-acting siRNAs (ta-siRNAs) as described in U.S. Pat. No. 8,030,473, and phased small RNAs (phased sRNAs) as described in U.S. Pat. No. 8,404,928. All mature small RNAs are single-stranded RNA molecules, generally about 18 to about 26 nucleotides in length, which are produced from longer, completely or substantially double-stranded RNA (dsRNA) precursors. For example, siRNAs are generally processed from perfectly or near-perfectly double-stranded RNA precursors, whereas both miRNAs and phased sRNAs are processed from larger precursors that contain at least some mismatched (non-base-paired) nucleotides and often substantial secondary structure such as loops and bulges in the otherwise largely double-stranded RNA precursor. Precursor molecules include naturally occurring precursors, which are often expressed in a specific (e.g., cell- or tissue-specific, temporally specific, developmentally specific, or inducible) expression pattern. Precursor molecules also include engineered precursor molecules, designed to produce small RNAs (e.g., artificial or engineered siRNAs or miRNAs) that target specific sequences; see, e.g., U.S. Pat. Nos. 7,691,995 and 7,786,350, which are incorporated herein by reference in their entirety. 
Thus, in embodiments, the polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule that is integrated (or that has a sequence that is integrated) at the site of at least one double-strand break (DSB) in a genome includes DNA that includes or encodes a small RNA precursor sequence designed to be processed in vivo to at least one corresponding mature small RNA. In embodiments, the polynucleotide donor molecule that is integrated (or that has a sequence that is integrated) at the site of at least one double-strand break (DSB) in a genome includes DNA that includes or encodes an engineered small RNA precursor sequence that is based on a naturally occurring “scaffold” precursor sequence but wherein the nucleotides of the encoded mature small RNA are designed to target a specific gene of interest that is different from the gene targeted by the natively encoded small RNA; in embodiments, the “scaffold” precursor sequence is one identified from the genome of a plant or a pest or pathogen of a plant; see, e.g., U.S. Pat. No. 8,410,334, which discloses transgenic expression of engineered invertebrate miRNA precursors in a plant, and which is incorporated herein by reference in its entirety.

Regardless of the pathway that generates the mature small RNA, the mechanism of action is generally similar; the mature small RNA binds in a sequence-specific manner to a small RNA recognition site located on an RNA molecule (such as a transcript or messenger RNA), and the resulting duplex is cleaved by a ribonuclease. The integration of a recognition site for a small RNA at the site of a DSB results in cleavage of the transcript including the integrated recognition site when and where the mature small RNA is expressed and available to bind to the recognition site. For example, a recognition site sequence for a mature siRNA or miRNA that is endogenously expressed only in male reproductive tissue of a plant can be integrated into a DSB, whereby a transcript containing the recognition site sequence is cleaved only where the mature siRNA or miRNA is expressed (i.e., in male reproductive tissue); this is useful, e.g., to prevent expression of a protein in male reproductive tissue such as pollen, and can be used in applications such as to induce male sterility in a plant or to prevent pollen development or shedding. Similarly, a recognition site sequence for a mature siRNA or miRNA that is endogenously expressed only in the roots of a plant can be integrated into a DSB, whereby a transcript containing the recognition site sequence is cleaved only in roots; this is useful, e.g., to prevent expression of a protein in roots. Non-limiting examples of useful small RNAs include: miRNAs having tissue-specific expression patterns disclosed in U.S. Pat. No. 8,334,430, miRNAs having temporally specific expression patterns disclosed in U.S. Pat. No. 8,314,290, miRNAs with stress-responsive expression patterns disclosed in U.S. Pat. No. 8,237,017, siRNAs having tissue-specific expression patterns disclosed in U.S. Pat. No. 9,139,838, and various miRNA recognition site sequences and the corresponding miRNAs disclosed in U.S. Patent Application Publication 2009/0293148. 
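Because the mature small RNA binds its recognition site in a sequence-specific manner, a candidate recognition site sequence can be derived from the mature small RNA sequence by taking its reverse complement. The following sketch assumes perfect complementarity, which is a simplification (naturally occurring recognition sites can contain mismatches); the function names and the 21-nucleotide example sequence are hypothetical placeholders, not sequences from the cited patents.

```python
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def recognition_site_rna(mature_small_rna):
    """Perfectly complementary site, 5' to 3', as it would appear on the transcript."""
    rna = mature_small_rna.upper().replace("T", "U")
    return rna.translate(RNA_COMPLEMENT)[::-1]


def recognition_site_dna(mature_small_rna):
    """Sense-strand DNA encoding the recognition site, for inclusion in a donor."""
    return recognition_site_rna(mature_small_rna).replace("U", "T")


def is_typical_length(mature_small_rna, low=18, high=26):
    """Check the 'about 18 to about 26 nucleotides' range stated in the text."""
    return low <= len(mature_small_rna) <= high


# Hypothetical mature small RNA (21 nt), for illustration only.
example_small_rna = "ACGUACGUACGUACGUACGUA"
print(is_typical_length(example_small_rna))
print(recognition_site_rna(example_small_rna))
print(recognition_site_dna(example_small_rna))
```

The DNA form is the sequence that would be placed, in the sense orientation, within the transcribed region at the site of the DSB so that the resulting transcript carries the recognition site.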
All of the patent publications referenced in this paragraph are incorporated herein by reference in their entirety. In embodiments, multiple edits in a genome are employed to obtain a desired phenotype or trait in a plant. In an embodiment, one or more edits (addition, deletion, or substitution of one or more nucleotides) of an endogenous nucleotide sequence is made to provide a general phenotype; addition of at least one small RNA recognition site by insertion of the recognition site sequence at a DSB that is functionally linked to the edited endogenous nucleotide sequence achieves more specific control of expression of the edited endogenous nucleotide sequence. In an example, an endogenous plant 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) is edited to provide a glyphosate-resistant EPSPS; for example, suitable changes include the amino acid substitutions Threonine-102-Isoleucine (T102I) and Proline-106-Serine (P106S) in the maize EPSPS sequence identified by Genbank accession number X63374 (see, for example, U.S. Pat. No. 6,762,344, incorporated herein by reference). In another example, an endogenous plant acetolactate synthase (ALS) is edited to increase resistance of the enzyme to various herbicides (e.g., sulfonylurea, imidazolinone, triazolopyrimidine, pyrimidinylthiobenzoate, sulfonylamino-carbonyltriazolinone); for example, suitable changes include amino acid substitutions at positions G115, A116, P191, A199, K250, M345, D370, V565, W568, and F572 of the Nicotiana tabacum ALS enzyme as described in U.S. Pat. No. 5,605,011, which is incorporated herein by reference. The edited herbicide-tolerant enzyme, combined with integration of at least one small RNA recognition site for a small RNA (e.g., an siRNA or a miRNA) expressed only in a specific tissue (for example, miRNAs specifically expressed in male reproductive tissue or female reproductive tissue, e.g., the miRNAs disclosed in Table 6 of U.S. Pat. No. 8,334,430 or the siRNAs disclosed in U.S. Pat. No. 9,139,838, both incorporated herein by reference) at a DSB functionally linked to (e.g., in the 3′ untranslated region of) the edited herbicide-tolerant enzyme results in expression of the edited herbicide-tolerant enzyme being restricted to tissues other than those in which the small RNA is endogenously expressed, and those tissues in which the small RNA is expressed will not be resistant to herbicide application; this approach is useful, e.g., to provide male-sterile or female-sterile plants.

In other embodiments, the sequence of an endogenous genomic locus encoding one or more small RNAs (e.g., miRNAs, siRNAs, ta-siRNAs) is altered in order to express a small RNA having a sequence that is different from that of the endogenous small RNA and is designed to target a new sequence of interest (e.g., a sequence of a plant pest, plant pathogen, symbiont of a plant, or symbiont of a plant pest or pathogen). For example, the sequence of an endogenous or native genomic locus encoding a miRNA precursor can be altered in the mature miRNA and the miR* sequences, while maintaining the secondary structure in the resulting altered miRNA precursor sequence to permit normal processing of the transcript to a mature miRNA with a different sequence from the original, native mature miRNA sequence; see, for example, U.S. Pat. Nos. 7,786,350 and 8,395,023, both of which are incorporated by reference in their entirety herein, and which teach methods of designing engineered miRNAs. In embodiments, the sequence of an endogenous genomic locus encoding one or more small RNAs (e.g., miRNAs, siRNAs, ta-siRNAs) is altered in order to express one or more small RNA cleavage blockers (see, e.g., U.S. Pat. No. 9,040,774, which is incorporated by reference in its entirety herein). In embodiments, the sequence of an endogenous genomic locus is altered to encode a small RNA decoy (see, e.g., U.S. Pat. No. 8,946,511, which is incorporated by reference in its entirety herein). In embodiments, the sequence of an endogenous genomic locus that natively contains a small RNA (e.g., miRNA, siRNA, or ta-siRNA) recognition or cleavage site is altered to delete or otherwise mutate the recognition or cleavage site and thus decouple the genomic locus from small RNA regulation.

Recombinases and Recombinase Recognition Sites: In an embodiment, the polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule that is integrated (or that has a sequence that is integrated) at the site of at least one double-strand break (DSB) in a genome includes DNA that includes or encodes a recombinase recognition site sequence that is recognized by a site-specific recombinase, the specific binding agent is the corresponding site-specific recombinase, and the change of expression is upregulation or downregulation or expression of a transcript having an altered sequence (for example, expression of a transcript that has had a region of DNA excised by the recombinase). The term “recombinase recognition site sequence” refers to the DNA sequences (usually a pair of sequences) that are recognized by a site-specific (i.e., sequence-specific) recombinase in a process that allows the excision (or, in some cases, inversion or translocation) of the DNA located between the sequence-specific recombination sites. For instance, Cre recombinase recognizes either loxP recombination sites or lox511 recombination sites, which are heterospecific, meaning that loxP and lox511 do not recombine together (see, e.g., Odell et al. (1994) Plant Physiol., 106:447-458); FLP recombinase recognizes frt recombination sites (see, e.g., Lyznik et al. (1996) Nucleic Acids Res., 24:3784-3789); R recombinase recognizes Rs recombination sites (see, e.g., Onouchi et al. (1991) Nucleic Acids Res., 19:6373-6378); Dre recombinase recognizes rox sites (see, e.g., U.S. Pat. No. 7,422,889, incorporated herein by reference); and Gin recombinase recognizes gix sites (see, e.g., Maeser et al. (1991) Mol. Gen. Genet., 230:170-176). 
In a non-limiting example, a pair of loxP recombinase recognition site sequences encoded by a pair of polynucleotide donor molecules is integrated at two separate DSBs; in the presence of the corresponding site-specific DNA recombinase Cre, the genomic sequence flanked on either side by the integrated loxP recognition sites is excised from the genome (for loxP sequences that are integrated in the same orientation relative to each other within the genome) or is inverted (for loxP sites that are integrated in an inverted orientation relative to each other within the genome) or is translocated (for loxP sites that are integrated on separate DNA molecules); such an approach is useful, e.g., for deletion or replacement of larger lengths of genomic sequence, for example, deletion or replacement of one or more protein domains. In embodiments, the recombinase recognition site sequences that are integrated at two separate DSBs are heterospecific, i.e., will not recombine together; for example, Cre recombinase recognizes either loxP recombination sites or lox511 recombination sites, which are heterospecific relative to each other, meaning that a loxP site and a lox511 site will not recombine together but only with another recombination site of its own type.
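The orientation-dependent outcomes described above (excision for sites in the same orientation, inversion for sites in inverted orientation) can be illustrated with a simple string model. This is a sketch under stated assumptions: it uses the commonly cited 34-bp loxP sequence (two 13-bp inverted repeats flanking an 8-bp asymmetric spacer), handles exactly two sites on one DNA molecule, and does not model translocation between separate molecules; the function names and flanking sequences are hypothetical.

```python
# Commonly cited 34-bp loxP site: 13-bp arm, 8-bp asymmetric spacer, 13-bp arm.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(genome, site=LOXP):
    """Return [(start, orientation)] for each site; '+' is forward, '-' is reversed."""
    hits = []
    rc = reverse_complement(site)
    for i in range(len(genome) - len(site) + 1):
        window = genome[i:i + len(site)]
        if window == site:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


def recombine(genome, site=LOXP):
    """Model recombination between exactly two sites on one molecule."""
    hits = find_sites(genome, site)
    if len(hits) != 2:
        raise ValueError("this sketch expects exactly two recognition sites")
    (a, ori_a), (b, ori_b) = hits
    n = len(site)
    if ori_a == ori_b:
        # Same orientation: the flanked segment is excised, leaving one site.
        return "excision", genome[:a + n] + genome[b + n:]
    # Inverted orientation: the flanked segment is inverted in place.
    between = genome[a + n:b]
    return "inversion", genome[:a + n] + reverse_complement(between) + genome[b:]


# Hypothetical flanking and intervening sequences, for illustration only.
left, middle, right = "AAAACCCC", "GGGGTTTTACGT", "CCCCAAAA"
print(recombine(left + LOXP + middle + LOXP + right)[0])
print(recombine(left + LOXP + middle + reverse_complement(LOXP) + right)[0])
```

Orientation is detectable in this model only because the spacer is asymmetric; the two 13-bp arms are inverted repeats of each other and carry no orientation information on their own.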

Integration of recombinase recognition sites is useful in plant breeding; in an embodiment, the method is used to provide a first parent plant having recombinase recognition site sequences heterologously integrated at two separate DSBs; crossing this first parent plant to a second parent plant that expresses the corresponding recombinase results in progeny plants in which the genomic sequence flanked on either side by the heterologously integrated recognition sites is excised from (or in some cases, inverted in) the genome. This approach is useful, e.g., for deletion of relatively large regions of DNA from a genome, for example, for excising DNA encoding a selectable or screenable marker that was introduced using transgenic techniques. Examples of heterologous arrangements or integration patterns of recombinase recognition sites and methods for their use, particularly in plant breeding, are disclosed in U.S. Pat. No. 8,816,153 (see, for example, the Figures and working examples), the entire specification of which is incorporated herein by reference.

Transcription Factors: In an embodiment, the sequence encoded by the polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule that is integrated (or that has a sequence that is integrated) at the site of at least one double-strand break (DSB) in a genome includes a transcription factor binding sequence, the specific binding agent is the corresponding transcription factor (or more specifically, the DNA-binding domain of the corresponding transcription factor), and the change in expression is upregulation or downregulation (depending on the type of transcription factor involved). In an embodiment, the transcription factor is an activating transcription factor or activator, and the change in expression is upregulation or increased expression (e.g., increased expression of at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 100% or greater, e.g., at least a 2-fold, 5-fold, 10-fold, 20-fold, 30-fold, 40-fold, 50-fold change, 100-fold or even 1000-fold change or greater) of a sequence of interest to which the transcription factor binding sequence, when integrated at a DSB in the genome, is operably linked. In some embodiments, expression is increased between 10-100%; between 2-fold and 5-fold; between 2-fold and 10-fold; between 10-fold and 50-fold; between 10-fold and 100-fold; between 100-fold and 1000-fold; between 1000-fold and 5,000-fold; or between 5,000-fold and 10,000-fold. In some embodiments, a targeted insertion may decrease expression by at least 1%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99% or more. 
In another embodiment, the transcription factor is a repressing transcription factor or repressor, and the change in expression is downregulation or decreased expression (e.g., decreased expression by at least 1%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99% or more) of a sequence of interest to which the transcription factor binding sequence, when integrated at a DSB in the genome, is operably linked. Embodiments of transcription factors include hormone receptors, e.g., nuclear receptors, which include both a hormone-binding domain and a DNA-binding domain; in embodiments, the polynucleotide donor molecule that is integrated (or that has a sequence that is integrated) at the site of at least one double-strand break (DSB) in a genome includes or encodes a hormone-binding domain of a nuclear receptor or a DNA-binding domain of a nuclear receptor. Various non-limiting examples of transcription factor binding sequences and transcription factors are provided in the working Examples. In embodiments, the sequence recognizable by a specific binding agent is a transcription factor binding sequence selected from those publicly disclosed at arabidopsis[dot]med[dot]ohio-state[dot]edu/AtcisDB/bindingsites[dot]html and neomorph[dot]salk[dot]edu/dap_web/pages/index[dot]php.
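The expression changes above are stated both as percentages and as fold changes. A brief sketch of the arithmetic relating the two follows, under the assumption that a percent increase is measured relative to the unmodified expression level (so that a 100% increase corresponds to a 2-fold change); the function names are hypothetical.

```python
def percent_increase_to_fold(percent):
    """A percent increase over baseline expressed as a fold change."""
    return 1 + percent / 100


def fold_to_percent_increase(fold):
    """A fold change expressed as a percent increase over baseline."""
    return (fold - 1) * 100


def fraction_remaining_after_decrease(percent):
    """Fraction of baseline expression remaining after a percent decrease."""
    return 1 - percent / 100


print(percent_increase_to_fold(100))          # a 100% increase, as a fold change
print(fold_to_percent_increase(10))           # a 10-fold change, as a percent increase
print(fraction_remaining_after_decrease(95))  # expression left after a 95% decrease
```

Under the same assumption, a decrease cannot exceed 100% of baseline, whereas increases are unbounded, which is consistent with decreases being recited only as percentages and large increases as fold changes.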

To summarize, the methods described herein permit sequences encoded by donor polynucleotides to be inserted, in a non-multiplexed or multiplexed manner, into a plant cell genome for the purpose of modulating gene expression in a number of distinct ways. Gene expression can be modulated up or down, for example, by tuning expression through the insertion of enhancer elements and transcription start sequences (e.g., nitrate response elements and auxin binding elements). Conditional transcription factor binding sites can be added or modified to allow additional control. Similarly, transcript stabilizing and/or destabilizing sequences can be inserted using the methods herein. Via the targeted insertion of stop codons, RNAi cleavage sites, or sites for recombinases, the methods described herein allow the transcription of particular sequences to be selectively turned off (likewise, the targeted removal of such sequences can be used to turn gene transcription on).

The plant genome targeting methods disclosed herein also enable transcription rates to be adjusted by the modification (optimization or de-optimization) of core promoter sequences (e.g., TATAA boxes). Proximal control elements (e.g., GC boxes; CAAT boxes) can likewise be modified. Enhancer or repressor motifs can be inserted or modified. Three-dimensional structural barriers in DNA that inhibit RNA polymerase can be created or removed via the targeted insertion of sequences, or by the modification of existing sequences. Where intron-mediated enhancement is known to affect transcript rate, the relevant rate-affecting sequences can be optimized or de-optimized (by insertion of additional sequences or modification of existing sequences) to further enhance or diminish transcription. Through the insertion or modification of sequences using the targeting methods described herein (including multiplexed targeting methods), mRNA stability and processing can be modulated (thereby modulating gene expression). For example, mRNA stabilizing or destabilizing motifs can be inserted, removed or modified; mRNA splicing donor/acceptor sites can be inserted, removed or modified and, in some instances, create the possibility of increased control over alternate splicing. Similarly, miRNA binding sites can be added, removed or modified using the methods described herein. Epigenetic regulation of transcription can also be adjusted according to the methods described herein (e.g., by increasing or decreasing the degree of methylation of DNA, or the degree of methylation or acetylation of histones). Epigenetic regulation using the tools and methods described herein can be combined with other methods for modifying genetic sequences described herein, for the purpose of modifying a trait of a plant cell or plant, or for creating populations of modified cells and cells from which desired phenotypes can be selected.

The plant genome targeting methods described herein can also be used to modulate translation efficiency by, e.g., modifying codon usage towards or away from a particular plant cell's bias. Similarly, through the use of the targeting methods described herein, Kozak sequences can be optimized or deoptimized, mRNA folding and structures affecting initiation of translation can be altered, and upstream reading frames can be created or destroyed. Through alteration of coding sequences using the targeted genome modification methods described herein, the abundance and/or activity of translated proteins can be adjusted. For example, the amino acid sequences in active sites or functional sites of proteins can be modified to increase or decrease the activity of the protein as desired; in addition, or alternatively, protein stabilizing or destabilizing motifs can be added or modified. All of the gene expression and activity modification schemes described herein can be utilized in various combinations to fine-tune gene expression and activity. Using the multiplexed targeting methods described herein, a plurality of specific targeted modifications can be achieved in a plant cell without intervening selection or sequencing steps.
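As an illustration of modifying codon usage towards a cell's bias, a coding sequence can be recoded with synonymous codons so that the encoded amino acid sequence is unchanged. The sketch below builds the standard genetic code and applies a preferred-codon map; the map shown is a hypothetical toy example and is not a measured soybean codon usage table, and the function names and example coding sequence are likewise placeholders.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (bases T, C, A, G).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a DNA coding sequence; '*' marks a stop codon."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


def recode(cds, preferred):
    """Swap each codon for the preferred synonymous codon, where one is given."""
    for aa, codon in preferred.items():
        if CODON_TABLE[codon] != aa:
            raise ValueError(f"{codon} does not encode {aa}")
    out = []
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        out.append(preferred.get(CODON_TABLE[codon], codon))
    return "".join(out)


# Hypothetical preferred-codon map and coding sequence, for illustration only.
toy_preferred = {"L": "CTC", "K": "AAG", "G": "GGA"}
example_cds = "ATGTTAAAAGGGTAA"
recoded = recode(example_cds, toy_preferred)
print(recoded)
print(translate(example_cds) == translate(recoded))
```

Moving codon usage away from the cell's bias (de-optimization) would use the same procedure with a map of disfavoured synonymous codons.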

Modified Plant Cells Comprising Specifically Targeted and Modified Genomes

Another aspect of the invention includes the cell, such as a plant cell, provided by the methods disclosed herein. In an embodiment, a plant cell thus provided includes in its genome a heterologous DNA sequence that includes: (a) nucleotide sequence of a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) molecule integrated at the site of a DSB in a genome; and (b) genomic nucleotide sequence adjacent to the site of the DSB. In embodiments, the methods disclosed herein for integrating a sequence encoded by a polynucleotide donor molecule into the site of a DSB are applied to a plant cell (e.g., a plant cell or plant protoplast isolated from a whole plant or plant part or plant tissue, or an isolated plant cell or plant protoplast in suspension or plate culture); in other embodiments, the methods are applied to non-isolated plant cells in situ or in planta, such as a plant cell located in an intact or growing plant or in a plant part or tissue. The methods disclosed herein for integrating a sequence encoded by a polynucleotide donor molecule into the site of a DSB are also useful in introducing heterologous sequence at the site of a DSB induced in the genome of other photosynthetic eukaryotes (e.g., green algae, red algae, diatoms, brown algae, and dinoflagellates). In embodiments, the plant cell or plant protoplast is capable of division and further differentiation. In embodiments, the plant cell or plant protoplast is obtained or isolated from a plant or part of a plant selected from the group consisting of a plant tissue, a whole plant, an intact nodal bud, a shoot apex or shoot apical meristem, a root apex or root apical meristem, lateral meristem, intercalary meristem, a seedling (e.g., a germinating seed or small seedling or a larger seedling with one or more true leaves), a whole seed (e.g., an intact seed, or a seed with part or all of its seed coat removed or treated to make permeable), a halved seed or other seed fragment, a zygotic or somatic embryo (e.g., a mature dissected zygotic embryo, a developing zygotic or somatic embryo, a dry or rehydrated or freshly excised zygotic embryo), pollen, microspores, epidermis, flower, and callus.

In some embodiments, the method includes the additional step of growing or regenerating a plant from a plant cell containing the heterologous DNA sequence of the polynucleotide donor molecule integrated at the site of a DSB and genomic nucleotide sequence adjacent to the site of the DSB, wherein the plant includes at least some cells that contain the heterologous DNA sequence of the polynucleotide donor molecule integrated at the site of a DSB and genomic nucleotide sequence adjacent to the site of the DSB. In embodiments, callus is produced from the plant cell, and plantlets and plants are produced from such callus. In other embodiments, whole seedlings or plants are grown directly from the plant cell without a callus stage. Thus, additional related aspects are directed to whole seedlings and plants grown or regenerated from the plant cell or plant protoplast containing sequence encoded by a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule heterologously integrated at the site of a DSB, as well as the seeds of such plants; embodiments include whole seedlings and plants grown or regenerated from the plant cell or plant protoplast containing sequence encoded by a polynucleotide donor molecule heterologously integrated at the site of two or more DSBs, as well as the seeds of such plants. In embodiments, the grown or regenerated plant exhibits a phenotype associated with the sequence encoded by a polynucleotide donor molecule heterologously integrated at the site of a DSB. 
In embodiments, the grown or regenerated plant includes in its genome two or more genetic modifications that in combination provide at least one phenotype of interest, wherein at least one of the two or more genetic modifications includes the sequence encoded by a polynucleotide donor molecule heterologously integrated at the site of a DSB in the genome, or wherein the two or more genetic modifications include sequence encoded by at least one polynucleotide donor heterologously integrated at two or more DSBs in the genome, or wherein the two or more genetic modifications include sequences encoded by multiple polynucleotide donor molecules heterologously integrated at different DSBs in the genome. In embodiments, a heterogeneous population of plant cells or plant protoplasts, at least some of which include sequence encoded by at least one polynucleotide donor molecule heterologously integrated at the site of a DSB, is provided by the method; related aspects include a plant having a phenotype of interest associated with sequence encoded by the polynucleotide donor molecule heterologously integrated at the site of a DSB, provided by either regeneration of a plant having the phenotype of interest from a plant cell or plant protoplast selected from the heterogeneous population of plant cells or plant protoplasts, or by selection of a plant having the phenotype of interest from a heterogeneous population of plants grown or regenerated from the population of plant cells or plant protoplasts. Examples of phenotypes of interest include (but are not limited to) herbicide resistance; improved tolerance of abiotic stress (e.g., tolerance of temperature extremes, drought, or salt) or biotic stress (e.g., resistance to bacterial or fungal pathogens); improved utilization of nutrients or water; synthesis of new or modified amounts of lipids, carbohydrates, proteins or other chemicals, including medicinal compounds; improved flavour or appearance; improved photosynthesis; improved storage characteristics (e.g., resistance to bruising, browning, or softening); increased yield; altered morphology (e.g., floral architecture or colour, plant height, branching, root structure); and changes in flowering time. In an embodiment, a heterogeneous population of plant cells or plant protoplasts (or seedlings or plants grown or regenerated therefrom) is exposed to conditions permitting expression of the phenotype of interest; e.g., selection for herbicide resistance can include exposing the population of plant cells or plant protoplasts (or seedlings or plants) to an amount of herbicide or other substance that inhibits growth or is toxic, allowing identification and selection of those resistant plant cells or plant protoplasts (or seedlings or plants) that survive treatment. In certain embodiments, a proxy measurement can be taken of an aspect of a modified plant or plant cell, where the measurement is indicative of a desired phenotype or trait. For example, the modification of one or more targeted sequences in a genome may provide a measurable change in a molecule (e.g., a detectable change in the structure of a molecule, or a change in the amount of the molecule that is detected, or the presence or absence of a molecule) that can be used as a biomarker for the presence of a desired phenotype or trait. The proper insertion of an enhancer for increasing expression of an enzyme, for example, may be determined by detecting lower levels of the enzyme's substrate.

In some embodiments, modified plants are produced from cells modified according to the methods described herein without a tissue culturing step. In certain embodiments, the modified plant cell or plant does not have significant losses of methylation compared to a non-modified parent plant cell or plant. For example, the modified plant lacks significant losses of methylation in one or more promoter regions relative to the parent plant cell or plant. Similarly, in certain embodiments, a modified plant or plant cell obtained using the methods described herein lacks significant losses of methylation in protein coding regions relative to the parent cell or parent plant before modification using the modifying methods described herein.

Also contemplated are new heterogeneous populations, arrays, or libraries of plant cells and plants created by the introduction of targeted modifications at one or more locations in the genome. Plant compositions of the invention include succeeding generations or seeds of modified plants that are grown or regenerated from plant cells or plant protoplasts modified according to the methods herein, as well as parts of those plants (including plant parts used in grafting as scions or rootstocks), or products (e. g., fruits or other edible plant parts, cleaned grains or seeds, edible oils, flours or starches, proteins, and other processed products) made from these plants or their seeds. Embodiments include plants grown or regenerated from the plant cells or plant protoplasts, wherein the plants contain cells or tissues that do not have sequence encoded by the polynucleotide donor molecule heterologously integrated at the site of a DSB, e. g., grafted plants in which the scion or rootstock contains sequence encoded by the polynucleotide donor molecule heterologously integrated at the site of a DSB, or chimeric plants in which some but not all cells or tissues contain sequence encoded by the polynucleotide donor molecule heterologously integrated at the site of a DSB. Plants in which grafting is commonly useful include many fruit trees and plants such as many citrus trees, apples, stone fruit (e. g., peaches, apricots, cherries, and plums), avocados, tomatoes, eggplant, cucumber, melons, watermelons, and grapes as well as various ornamental plants such as roses. Grafted plants can be grafts between the same or different (generally related) species.
Additional related aspects include (a) a hybrid plant providedby crossing a first plant grown or regenerated from a plant cell orplant protoplast with sequence encoded by at least one polynucleotidedonor molecule heterologously integrated at the site of a DSB, with asecond plant, wherein the hybrid plant contains sequence encoded by thepolynucleotide donor molecule heterologously integrated at the site of aDSB, and (b) a hybrid plant provided by crossing a first plant grown orregenerated from a plant cell or plant protoplast with sequence encodedby at least one polynucleotide donor molecule heterologously integratedat multiple DSB sites, with a second plant, wherein the hybrid plantcontains sequence encoded by at least one polynucleotide donor moleculeheterologously integrated at the site of at least one DSB; alsocontemplated is seed produced by the hybrid plant. Also envisioned asrelated aspects are progeny seed and progeny plants, including hybridseed and hybrid plants, having the regenerated plant as a parent orancestor. In embodiments, the plant cell (or the regenerated plant,progeny seed, and progeny plant) is diploid or polyploid. Inembodiments, the plant cell (or the regenerated plant, progeny seed, andprogeny plant) is haploid or can be induced to become haploid;techniques for making and using haploid plants and plant cells are knownin the art, see, e. g., methods for generating haploids in Arabidopsisthaliana by crossing of a wild-type strain to a haploid-inducing strainthat expresses altered forms of the centromere-specific histone CENH3,as described by Maruthachalam and Chan in “How to make haploidArabidopsis thaliana”, a protocol publicly available atwww[dot]openwetware[dot]org/images/d/d3/Haploid_Arabidopsisprotocol[dot]pdf;Ravi et al. (2014) Nature Communications, 5:5334, doi:10.1038/ncomms6334). Examples of haploid cells include but are notlimited to plant cells obtained from haploid plants and plant cellsobtained from reproductive tissues, e. 
g., from flowers, developingflowers or flower buds, ovaries, ovules, megaspores, anthers, pollen,and microspores. In embodiments where the plant cell is haploid, themethod can further include the step of chromosome doubling (e. g., byspontaneous chromosomal doubling by meiotic non-reduction, or by using achromosome doubling agent such as colchicine, oryzalin, trifluralin,pronamide, nitrous oxide gas, anti-microtubule herbicides,anti-microtubule agents, and mitotic inhibitors) in the plant cellcontaining heterologous DNA sequence (i. e. sequence of thepolynucleotide donor molecule integrated at the site of a DSB in thegenome and genomic nucleotide sequence adjacent to the site of the DSB)to produce a doubled haploid plant cell or plant protoplast that ishomozygous for the heterologous DNA sequence; yet other embodimentsinclude regeneration of a doubled haploid plant from the doubled haploidplant cell or plant protoplast, wherein the regenerated doubled haploidplant is homozygous for the heterologous DNA sequence. Thus, aspects ofthe invention are related to the haploid plant cell or plant protoplasthaving the heterologous DNA sequence of the polynucleotide donormolecule integrated at the site of a DSB and genomic nucleotide sequenceadjacent to the site of the DSB, as well as a doubled haploid plant cellor plant protoplast or a doubled haploid plant that is homozygous forthe heterologous DNA sequence. Another aspect of the invention isrelated to a hybrid plant having at least one parent plant that is adoubled haploid plant provided by the method. Production of doubledhaploid plants by these methods provides homozygosity in one generation,instead of requiring several generations of self-crossing to obtainhomozygous plants; this may be particularly advantageous in slow-growingplants, such as fruit and other trees, or for producing hybrid plantsthat are offspring of at least one doubled-haploid plant.
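
The generational advantage of doubled haploids noted above can be illustrated with a short calculation. The following is a minimal sketch, assuming simple Mendelian inheritance at a single locus and a heterozygous starting plant, in which each generation of self-crossing is expected to halve the heterozygous fraction. The function names are hypothetical and serve only as illustration.

```python
def heterozygous_fraction(generations: int) -> float:
    """Expected fraction of plants still heterozygous at one locus after
    the given number of selfing generations, starting from a heterozygote."""
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return 0.5 ** generations


def generations_to_reach(target_homozygosity: float) -> int:
    """Smallest number of selfing generations whose expected homozygous
    fraction at one locus meets or exceeds the target (0 <= target < 1)."""
    if not 0.0 <= target_homozygosity < 1.0:
        raise ValueError("target must be in the range [0, 1)")
    n = 0
    while 1.0 - heterozygous_fraction(n) < target_homozygosity:
        n += 1
    return n
```

Under these assumptions, an expected homozygosity of 99% at a single locus requires seven generations of self-crossing, whereas chromosome doubling of a haploid cell is expected to yield a homozygous locus in a single step.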

Plants and plant cells that may be modified according to the methods described herein are of any species of interest, including dicots and monocots, but especially soybean species (including hybrid species).

The soybean cells and derivative plants and seeds disclosed herein can be used for various purposes useful to the consumer or grower. The intact plant itself may be desirable, e. g., plants grown as cover crops or as ornamentals. In other embodiments, processed products are made from the plant or its seeds, such as extracted proteins, oils, sugars, and starches, fermentation products, animal feed or human food, wood and wood products, pharmaceuticals, and various industrial products. Thus, further related aspects of the invention include a processed or commodity product made from a plant or seed or plant part that includes at least some cells that contain the heterologous DNA sequence including the sequence encoded by the polynucleotide donor molecule integrated at the site of a DSB and genomic nucleotide sequence adjacent to the site of the DSB. Commodity products include, but are not limited to, harvested leaves, roots, shoots, tubers, stems, fruits, seeds, or other parts of a plant, meals, oils (edible or inedible), fiber, extracts, fermentation or digestion products, crushed or whole grains or seeds of a plant, wood and wood pulp, or any food or non-food product. Detection of a heterologous DNA sequence that includes: (a) nucleotide sequence encoded by a polynucleotide donor molecule integrated at the site of a DSB in a genome; and (b) genomic nucleotide sequence adjacent to the site of the DSB in such a commodity product is de facto evidence that the commodity product contains or is derived from a plant cell, plant, or seed of this invention.

In another aspect, the invention provides a heterologous nucleotide sequence including: (a) nucleotide sequence encoded by a polynucleotide donor molecule integrated by the methods disclosed herein at the site of a DSB in a genome, and (b) genomic nucleotide sequence adjacent to the site of the DSB. Related aspects include a plasmid, vector, or chromosome including such a heterologous nucleotide sequence, as well as polymerase primers for amplification (e. g., PCR amplification) of such a heterologous nucleotide sequence.

Compositions and Reaction Mixtures

In one aspect, the invention provides a composition including: (a) a cell; and (b) a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule that is capable of being integrated (or having its sequence integrated) (preferably by non-homologous end-joining (NHEJ)) at one or more double-strand breaks in a genome in the cell. In many embodiments of the composition, the cell is a plant cell, e. g., an isolated plant cell or a plant protoplast, or a plant cell in a plant, plant part, plant tissue, or callus. In certain embodiments, the cell is that of a photosynthetic eukaryote (e. g., green algae, red algae, diatoms, brown algae, and dinoflagellates).

In various embodiments of the composition, the plant cell is a plant cell or plant protoplast isolated from a whole plant or plant part or plant tissue (e. g., a plant cell or plant protoplast cultured in liquid medium or on solid medium), or a plant cell located in callus, an intact plant, seed, or seedling, or in a plant part or tissue. In embodiments, the plant cell is a cell of a monocot plant or of a dicot plant. In many embodiments, the plant cell is a plant cell capable of division and/or differentiation, including a plant cell capable of being regenerated into callus or a plant. In embodiments, the plant cell is capable of division and further differentiation, even capable of being regenerated into callus or into a plant. In embodiments, the plant cell is diploid, polyploid, or haploid (or can be induced to become haploid).

In embodiments, the composition includes a plant cell that includes at least one double-strand break (DSB) in its genome. Alternatively, the composition includes a plant cell in which at least one DSB will be induced in its genome, for example, by providing at least one DSB-inducing agent to the plant cell, e. g., either together with the polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule or separately. Thus, the composition optionally further includes at least one DSB-inducing agent. In embodiments, the composition optionally further includes at least one chemical, enzymatic, or physical delivery agent, or a combination thereof; such delivery agents and methods for their use are described in detail in the paragraphs following the heading “Delivery Methods and Delivery Agents”. In embodiments, the DSB-inducing agent is at least one of the group consisting of:

-   (a) a nuclease selected from the group consisting of an RNA-guided nuclease, an RNA-guided DNA endonuclease, a type II Cas nuclease, a Cas9, a type V Cas nuclease, a Cpf1, a CasY, a CasX, a C2c1, a C2c3, an engineered nuclease, a codon-optimized nuclease, a zinc-finger nuclease (ZFN), a transcription activator-like effector nuclease (TAL-effector nuclease), an Argonaute, and a meganuclease or engineered meganuclease;
-   (b) a polynucleotide encoding one or more nucleases capable of effecting site-specific alteration (such as introduction of a DSB) of a target nucleotide sequence; and
-   (c) a guide RNA (gRNA) for an RNA-guided nuclease, or a DNA encoding a gRNA for an RNA-guided nuclease.

In embodiments, the composition includes (a) a cell; (b) a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule, capable of being integrated (or having its sequence integrated) at a DSB; (c) a Cas9, a Cpf1, a CasY, a CasX, a C2c1, or a C2c3 nuclease; and (d) at least one guide RNA. In an embodiment, the composition includes (a) a cell; (b) a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule, capable of being integrated (or having its sequence integrated) at a DSB; and (c) at least one ribonucleoprotein including a CRISPR nuclease and a guide RNA.

In embodiments of the composition, the polynucleotide donor molecule isdouble-stranded and blunt-ended, or is double stranded and has anoverhang or “sticky end” consisting of unpaired nucleotides (e. g., 1,2, 3, 4, 5, or 6 unpaired nucleotides) at one terminus or both termini;in other embodiments, the polynucleotide donor molecule is asingle-stranded DNA or a single-stranded DNA/RNA hybrid. In anembodiment, the polynucleotide donor molecule is a double-stranded DNAor DNA/RNA hybrid molecule that is blunt-ended or that has an overhangat one terminus or both termini, and that has about 18 to about 300base-pairs, or about 20 to about 200 base-pairs, or about 30 to about100 base-pairs, and having at least one phosphorothioate bond betweenadjacent nucleotides at a 5′ end, 3′ end, or both 5′ and 3′ ends. Inembodiments, the polynucleotide donor molecule is a double-stranded DNA,a single-stranded DNA, a single-stranded DNA/RNA hybrid, or adouble-stranded DNA/RNA hybrid, and includes single strands of at least11, at least 18, at least 20, at least 30, at least 40, at least 60, atleast 80, at least 100, at least 120, at least 140, at least 160, atleast 180, at least 200, at least 240, at least 280, or at least 320nucleotides. In embodiments, the polynucleotide donor molecule includeschemically modified nucleotides; in embodiments, the naturally occurringphosphodiester backbone of the polynucleotide molecule is partially orcompletely modified with phosphorothioate, phosphorodithioate, ormethylphosphonate internucleotide linkage modifications, or thepolynucleotide donor molecule includes modified nucleoside bases ormodified sugars, or the polynucleotide donor molecule is labelled with afluorescent moiety or other detectable label. In an embodiment, thepolynucleotide donor molecule is double-stranded and perfectlybase-paired through all or most of its length, with the possibleexception of any unpaired nucleotides at either terminus or bothtermini. 
In another embodiment, the polynucleotide donor molecule isdouble-stranded and includes one or more non-terminal mismatches ornon-terminal unpaired nucleotides within the otherwise double-strandedduplex. Other related embodiments include single- or double-strandedDNA/RNA hybrid donor molecules. Additional description of thepolynucleotide donor molecule is found above in the paragraphs followingthe heading “Polynucleotide Molecules”.

In embodiments of the composition, the polynucleotide donor molecule includes:

-   (a) a nucleotide sequence that is recognizable by a specific binding agent;
-   (b) a nucleotide sequence encoding an RNA molecule or an amino acid sequence that is recognizable by a specific binding agent;
-   (c) a nucleotide sequence that encodes an RNA molecule or an amino acid sequence that binds specifically to a ligand;
-   (d) a nucleotide sequence that is responsive to a specific change in the physical environment; or
-   (e) a nucleotide sequence encoding an RNA molecule or an amino acid sequence that is responsive to a specific change in the physical environment;
-   (f) a nucleotide sequence encoding at least one stop codon on each strand;
-   (g) a nucleotide sequence encoding at least one stop codon within each reading frame on each strand; or
-   (h) at least partially self-complementary sequence, such that the polynucleotide molecule encodes a transcript that is capable of forming at least partially double-stranded RNA; or
-   (i) a combination of any of (a)-(h).

Additional description relating to these various embodiments of nucleotide sequences included in the polynucleotide donor molecule is found in the section headed “Methods of changing expression of a sequence of interest in a genome”.

In another aspect, the invention provides a reaction mixture including: (a) a plant cell having a double-strand break (DSB) at at least one locus in its genome; and (b) a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule capable of being integrated or inserted (or having its sequence integrated or inserted) at the DSB (preferably by non-homologous end-joining (NHEJ)), with a length of between about 18 to about 300 base-pairs (or nucleotides, if single-stranded), or between about 30 to about 100 base-pairs (or nucleotides, if single-stranded); wherein sequence encoded by the polynucleotide donor molecule, if integrated at the DSB, forms a heterologous insertion (that is to say, resulting in a concatenated nucleotide sequence that is a combination of the sequence of the polynucleotide molecule and at least some of the genomic sequence adjacent to the site of DSB, wherein the concatenated sequence is heterologous, i. e., would not otherwise or does not normally occur at the site of insertion). In embodiments, the product of the reaction mixture includes a plant cell in which sequence encoded by the polynucleotide donor molecule has been integrated at the site of the DSB.
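
The concatenated sequence described above can be modelled with a short string-based sketch. The example below assumes an idealized, scarless insertion (end-joining in a cell may in practice add or remove a few nucleotides at the junctions) and represents the genomic locus and donor as plain DNA strings. All function names and the `flank` parameter are hypothetical and used only for illustration.

```python
def integrate_at_dsb(genomic: str, dsb_index: int, donor: str) -> str:
    """Return the edited sequence with the donor placed between
    genomic[:dsb_index] and genomic[dsb_index:] (idealized insertion)."""
    if not 0 <= dsb_index <= len(genomic):
        raise ValueError("dsb_index lies outside the genomic sequence")
    return genomic[:dsb_index] + donor + genomic[dsb_index:]


def junction_window(genomic: str, dsb_index: int, donor: str, flank: int = 6) -> str:
    """Donor sequence plus up to `flank` genomic bases on each side of the DSB."""
    left = genomic[max(0, dsb_index - flank):dsb_index]
    right = genomic[dsb_index:dsb_index + flank]
    return left + donor + right


def is_heterologous(genomic: str, window: str) -> bool:
    """True if the concatenated window does not occur in the unedited sequence."""
    return window.upper() not in genomic.upper()
```

In this toy model, a window spanning both junctions occurs in the edited sequence but not in the unedited one, which is the property the text refers to as a heterologous insertion.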

In many embodiments of the reaction mixture, the cell is a plant cell, e. g., an isolated plant cell or a plant protoplast, or a plant cell in a plant, plant part, plant tissue, or callus. In various embodiments of the reaction mixture, the plant cell is a plant cell or plant protoplast isolated from a whole plant or plant part or plant tissue (e. g., a plant cell or plant protoplast cultured in liquid medium or on solid medium), or a plant cell located in callus, an intact plant, seed, or seedling, or in a plant part or tissue. In embodiments, the plant cell is a cell of a monocot plant or of a dicot plant. In many embodiments, the plant cell is a plant cell capable of division and/or differentiation, including a plant cell capable of being regenerated into callus or a plant. In embodiments, the plant cell is capable of division and further differentiation, even capable of being regenerated into callus or into a plant. In embodiments, the plant cell is diploid, polyploid, or haploid (or can be induced to become haploid).

In embodiments, the reaction mixture includes a plant cell that includes at least one double-strand break (DSB) in its genome. Alternatively, the reaction mixture includes a plant cell in which at least one DSB will be induced in its genome, for example, by providing at least one DSB-inducing agent to the plant cell, e. g., either together with a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule capable of being integrated or inserted (or having its sequence integrated or inserted) at the DSB, or separately. Thus, the reaction mixture optionally further includes at least one DSB-inducing agent. In embodiments, the reaction mixture optionally further includes at least one chemical, enzymatic, or physical delivery agent, or a combination thereof; such delivery agents and methods for their use are described in detail in the paragraphs following the heading “Delivery Methods and Delivery Agents”. In embodiments, the DSB-inducing agent is at least one of the group consisting of:

-   (a) a nuclease selected from the group consisting of an RNA-guided nuclease, an RNA-guided DNA endonuclease, a type II Cas nuclease, a Cas9, a type V Cas nuclease, a Cpf1, a CasY, a CasX, a C2c1, a C2c3, an engineered nuclease, a codon-optimized nuclease, a zinc-finger nuclease (ZFN), a transcription activator-like effector nuclease (TAL-effector nuclease), an Argonaute, and a meganuclease or engineered meganuclease;
-   (b) a polynucleotide encoding one or more nucleases capable of effecting site-specific alteration (such as introduction of a DSB) of a target nucleotide sequence; and
-   (c) a guide RNA (gRNA) for an RNA-guided nuclease, or a DNA encoding a gRNA for an RNA-guided nuclease.

In embodiments, the reaction mixture includes (a) a plant cell; (b) a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule capable of being integrated or inserted (or having its sequence integrated or inserted) at the DSB; (c) a Cas9, a Cpf1, a CasY, a CasX, a C2c1, or a C2c3 nuclease; and (d) at least one guide RNA. In an embodiment, the reaction mixture includes (a) a plant cell or a plant protoplast; (b) a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule capable of being integrated or inserted (or having its sequence integrated or inserted) at the DSB; and (c) at least one ribonucleoprotein including a CRISPR nuclease and a guide RNA. In an embodiment, the reaction mixture includes (a) a plant cell or a plant protoplast; (b) a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule capable of being integrated or inserted (or having its sequence integrated or inserted) at the DSB; and (c) at least one ribonucleoprotein including Cas9 and an sgRNA.

In embodiments of the reaction mixture, the polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule includes:

-   (a) a nucleotide sequence that is recognizable by a specific binding agent;
-   (b) a nucleotide sequence encoding an RNA molecule or an amino acid sequence that is recognizable by a specific binding agent;
-   (c) a nucleotide sequence that encodes an RNA molecule or an amino acid sequence that binds specifically to a ligand;
-   (d) a nucleotide sequence that is responsive to a specific change in the physical environment; or
-   (e) a nucleotide sequence encoding an RNA molecule or an amino acid sequence that is responsive to a specific change in the physical environment;
-   (f) a nucleotide sequence encoding at least one stop codon on each strand;
-   (g) a nucleotide sequence encoding at least one stop codon within each reading frame on each strand; or
-   (h) at least partially self-complementary sequence, such that the polynucleotide molecule encodes a transcript that is capable of forming at least partially double-stranded RNA; or
-   (i) a combination of any of (a)-(h).

Additional description relating to these various embodiments of nucleotide sequences included in the polynucleotide donor molecule is found in the section headed “Methods of changing expression of a sequence of interest in a genome”.

Polynucleotides for Disrupting Gene Expression

In another aspect, the invention provides a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) molecule for disrupting gene expression, including double-stranded polynucleotides containing at least 18 base-pairs and encoding at least one stop codon in each possible reading frame on each strand and single-stranded polynucleotides containing at least 11 contiguous nucleotides and encoding at least one stop codon in each possible reading frame on the strand. Such a stop-codon-containing polynucleotide, when integrated or inserted at the site of a DSB in a genome, disrupts or hinders translation of an encoded amino acid sequence. In embodiments, the polynucleotide is a double-stranded DNA or double-stranded DNA/RNA hybrid molecule including at least 18 contiguous base-pairs and encoding at least one stop codon in each possible reading frame on either strand; in embodiments, the polynucleotide is a double-stranded DNA or double-stranded DNA/RNA hybrid molecule that is blunt-ended; in other embodiments, the polynucleotide is a double-stranded DNA or double-stranded DNA/RNA hybrid molecule that has one or more overhangs or unpaired nucleotides at one or both termini. In embodiments, the polynucleotide is double-stranded and includes between about 18 to about 300 nucleotides on each strand. In embodiments, the polynucleotide is a single-stranded DNA or a single-stranded DNA/RNA hybrid molecule including at least 11 contiguous nucleotides and encoding at least one stop codon in each possible reading frame on the strand. In embodiments, the polynucleotide is single-stranded and includes between 11 and about 300 contiguous nucleotides in the strand.
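
The requirement of a stop codon in each possible reading frame can be checked computationally. The following is a minimal sketch, assuming the standard genetic code (stop codons TAA, TAG, and TGA) and a DNA sequence written 5′ to 3′ using only the letters A, C, G, and T. The function names and the example sequences are hypothetical and are not sequences from this disclosure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def frames_with_stop(strand: str) -> list[bool]:
    """For each of the three reading frames of one strand, report whether
    at least one complete codon in that frame is a stop codon."""
    strand = strand.upper()
    result = []
    for offset in range(3):
        codons = (strand[i:i + 3] for i in range(offset, len(strand) - 2, 3))
        result.append(any(codon in STOP_CODONS for codon in codons))
    return result


def stops_in_all_frames(seq: str, double_stranded: bool = True) -> bool:
    """True if every reading frame on the strand (single-stranded case) or
    on both strands (double-stranded case) contains a stop codon."""
    strands = [seq]
    if double_stranded:
        strands.append(reverse_complement(seq))
    return all(all(frames_with_stop(strand)) for strand in strands)
```

As an illustration, the 11-nucleotide toy sequence `TAACTAACTAA` carries a stop codon in each of its three frames and so passes the single-stranded check, while the 22-base-pair toy sequence `TAACTAACTAATTAGTTAGTTA` passes the double-stranded check because its reverse complement is identical to the forward strand.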

In embodiments, the polynucleotide for disrupting gene expression further includes a nucleotide sequence that provides a useful function when integrated into the site of a DSB in a genome. For example, in various non-limiting embodiments the polynucleotide further includes: sequence that is recognizable by a specific binding agent or that binds to a specific molecule or encodes an RNA molecule or an amino acid sequence that binds to a specific molecule, or sequence that is responsive to a specific change in the physical environment or encodes an RNA molecule or an amino acid sequence that is responsive to a specific change in the physical environment, or heterologous sequence, or sequence that serves to stop transcription at the site of the DSB, or sequence having secondary structure (e. g., double-stranded stems or stem-loops) or that encodes a transcript having secondary structure (e. g., double-stranded RNA that is cleavable by a Dicer-type ribonuclease).

In an embodiment, the polynucleotide for disrupting gene expression is a double-stranded DNA or a double-stranded DNA/RNA hybrid molecule, wherein each strand of the polynucleotide includes at least 18 and fewer than 200 contiguous base-pairs, wherein the number of base-pairs is not divisible by 3, and wherein each strand encodes at least one stop codon in each possible reading frame in the 5′ to 3′ direction. In an embodiment, the polynucleotide is a double-stranded DNA or a double-stranded DNA/RNA hybrid molecule, wherein the polynucleotide includes at least one phosphorothioate modification.
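
The length conditions of this embodiment can be expressed as a simple check. The sketch below covers only the stated length conditions (at least 18, fewer than 200, not divisible by 3) and does not test for stop codons. The text does not state why a length not divisible by 3 is chosen; one common interpretation, offered here as an assumption, is that such an insertion also shifts the downstream reading frame by the remainder computed below. The function names are hypothetical.

```python
def downstream_frame_shift(insert_length: int) -> int:
    """Number of positions (0, 1, or 2) by which sequence downstream of an
    insertion of this length is displaced relative to the original frame."""
    if insert_length < 0:
        raise ValueError("insert_length must be non-negative")
    return insert_length % 3


def meets_length_conditions(insert_length: int) -> bool:
    """True if the length is at least 18, fewer than 200, and not divisible by 3."""
    return 18 <= insert_length < 200 and insert_length % 3 != 0
```

For example, a 22-base-pair insert satisfies the length conditions and displaces downstream sequence by one position, whereas an 18-base-pair insert is within the size range but is divisible by 3.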

Related aspects include larger polynucleotides such as a plasmid, vector, or chromosome including the polynucleotide for disrupting gene expression, as well as polymerase primers for amplification of the polynucleotide for disrupting gene expression.

Methods of Identifying the Locus of a Double-Stranded Break

In another aspect, the invention provides a method of identifying thelocus of at least one double-stranded break (DSB) in genomic DNA in acell (such as a plant cell or plant protoplast) including the genomicDNA, the method including: (a) contacting the genomic DNA having a DSBwith a polynucleotide (such as a double-stranded DNA, a single-strandedDNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNAhybrid) molecule, wherein the polynucleotide donor molecule is capableof being integrated (or having its sequence integrated) at the DSB andhas a length of at least 2, at least 3, at least 4, at least 5, at least6, at least 7, at least 8, at least 9, at least 10, or at least 11base-pairs if double-stranded (or nucleotides if single-stranded), orbetween about 2 to about 320 base-pairs if double-stranded (ornucleotides if single-stranded), or between about 2 to about 500base-pairs if double-stranded (or nucleotides if single-stranded), orbetween about 5 to about 500 base-pairs if double-stranded (ornucleotides if single-stranded), or between about 5 to about 300base-pairs if double-stranded (or nucleotides if single-stranded), orbetween about 11 to about 300 base-pairs if double-stranded (ornucleotides if single-stranded), or about 18 to about 300 base-pairs ifdouble-stranded (or nucleotides if single-stranded), or between about 30to about 100 base-pairs if double-stranded (or nucleotides ifsingle-stranded); wherein sequence encoded by the polynucleotide donormolecule, if integrated at the DSB, forms a heterologous insertion; and(b) using at least part of the sequence encoded by the polynucleotidemolecule as a target for PCR primers to allow amplification of DNA inthe locus of the double-stranded break. In embodiments, the genomic DNAis that of a nucleus, mitochondrion, or plastid. 
In embodiments, the DSB locus is identified by amplification using primers specific for DNA sequence encoded by the polynucleotide molecule alone; in other embodiments, the DSB locus is identified by amplification using primers specific for a combination of DNA sequence encoded by the polynucleotide donor molecule and genomic DNA sequence flanking the DSB. Such identification using a heterologously integrated DNA sequence (i. e., that encoded by the polynucleotide molecule) is useful, e. g., to distinguish a cell (such as a plant cell or plant protoplast) containing sequence encoded by the polynucleotide molecule integrated at the DSB from a cell that does not. Distinguishing an edited genome from a non-edited genome is important for various purposes, e. g., for commercial or regulatory tracking of cells or biological material such as plants or seeds containing an edited genome.
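
The primer-based identification described above can be illustrated with a simplified in-silico amplification. The sketch below assumes exact primer matching on plain DNA strings, with one primer taken from the integrated donor sequence and the other from the flanking genomic sequence. The sequences, constant names, and function names are hypothetical toy values, not sequences from this disclosure, and the model ignores mismatches, melting temperature, and other practical PCR considerations.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def in_silico_amplicon(template: str, forward_primer: str,
                       reverse_primer: str) -> Optional[str]:
    """Return the predicted amplicon (forward primer through reverse primer
    site) on the given strand, or None if either primer site is absent."""
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Hypothetical toy locus: genomic flanks on either side of a DSB, and a donor.
LEFT_FLANK = "GATTACAGGCTTACGATCG"
RIGHT_FLANK = "CCGTTGCATGGATCCAAGT"
DONOR = "TAGCTAACTGACTAGGTGAATC"
EDITED = LEFT_FLANK + DONOR + RIGHT_FLANK
UNEDITED = LEFT_FLANK + RIGHT_FLANK

FORWARD_PRIMER = DONOR[:12]                             # binds the insert only
REVERSE_PRIMER = reverse_complement(RIGHT_FLANK[-10:])  # binds genomic flank
```

With these toy values, an amplicon is predicted only for the edited sequence, because the forward primer site exists only where donor sequence has been integrated; the unedited sequence yields no product.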

In a related aspect, the invention provides a method of identifying thelocus of double-stranded breaks (DSBs) in genomic DNA in a pool of cells(such as a pool of plant cells or plant protoplasts), wherein the poolof cells includes cells having genomic DNA with sequence encoded by apolynucleotide (such as a double-stranded DNA, a single-stranded DNA, asingle-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid)donor molecule inserted at the locus of the double stranded breaks;wherein the polynucleotide donor molecule is capable of being integrated(or having its sequence integrated) at the DSB and has a length of atleast 2, at least 3, at least 4, at least 5, at least 6, at least 7, atleast 8, at least 9, at least 10, or at least 11 base-pairs ifdouble-stranded (or nucleotides if single-stranded), or between about 2to about 320 base-pairs if double-stranded (or nucleotides ifsingle-stranded), or between about 2 to about 500 base-pairs ifdouble-stranded (or nucleotides if single-stranded), or between about 5to about 500 base-pairs if double-stranded (or nucleotides ifsingle-stranded), or between about 5 to about 300 base-pairs ifdouble-stranded (or nucleotides if single-stranded), or between about 11to about 300 base-pairs if double-stranded (or nucleotides ifsingle-stranded), or about 18 to about 300 base-pairs if double-stranded(or nucleotides if single-stranded), or between about 30 to about 100base-pairs if double-stranded (or nucleotides if single-stranded);wherein sequence encoded by the polynucleotide donor molecule, ifintegrated at the DSB, forms a heterologous insertion; wherein thesequence encoded by the polynucleotide molecule is used as a target forPCR primers to allow amplification of DNA in the region of thedouble-stranded breaks. In embodiments, the genomic DNA is that of anucleus, mitochondrion, or plastid. 
In embodiments, the pool of cells is a population of plant cells or plant protoplasts, wherein the population of plant cells or plant protoplasts includes multiple different DSBs (e. g., induced by different guide RNAs) in the genome. In embodiments, each DSB locus is identified by amplification using primers specific for DNA sequence encoded by the polynucleotide molecule alone; in other embodiments, each DSB locus is identified by amplification using primers specific for a combination of DNA sequence encoded by the polynucleotide molecule and genomic DNA sequence flanking the DSB. Such identification using a heterologously integrated DNA sequence (i. e., sequence encoded by the polynucleotide molecule) is useful, e. g., to distinguish a cell (such as a plant cell or plant protoplast) containing sequence encoded by the polynucleotide molecule integrated at a DSB from a cell that does not.
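
For a pool containing multiple different DSBs, the integrated donor sequence can serve as a common anchor for locating each insertion site in sequence data. The following is a minimal sketch, assuming that sequence reads spanning the insertions are available as plain strings, that the donor is integrated intact, and that the genomic bases immediately to the left of the donor occur exactly once in the reference. The function names and the `flank` parameter are hypothetical and used only for illustration.

```python
from collections import Counter
from typing import Iterable, Optional


def locate_insertion(read: str, donor: str, reference: str,
                     flank: int = 8) -> Optional[int]:
    """Return the reference coordinate at which the donor was inserted,
    or None if the read lacks the donor or the flank is missing or ambiguous."""
    at = read.find(donor)
    if at < flank:  # donor absent (-1) or too little left flank in the read
        return None
    left_flank = read[at - flank:at]
    hit = reference.find(left_flank)
    if hit == -1 or reference.find(left_flank, hit + 1) != -1:
        return None  # flank not found, or found more than once
    return hit + flank


def tally_loci(reads: Iterable[str], donor: str, reference: str,
               flank: int = 8) -> Counter:
    """Count how many reads support each inferred insertion coordinate."""
    positions = (locate_insertion(r, donor, reference, flank) for r in reads)
    return Counter(p for p in positions if p is not None)
```

Reads without the donor sequence contribute nothing to the tally, so the result lists only loci at which sequence encoded by the donor molecule was detected, together with the number of supporting reads.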

In embodiments, the pool of cells is a pool of isolated plant cells or plant protoplasts in liquid or suspension culture, or cultured in or on semi-solid or solid media. In embodiments, the pool of cells is a pool of plant cells or plant protoplasts encapsulated in a polymer or other encapsulating material, enclosed in a vesicle or liposome, or embedded in or attached to a matrix or other solid support (e. g., beads or microbeads, membranes, or solid surfaces). In embodiments, the pool of cells is a pool of plant cells or plant protoplasts encapsulated in a polysaccharide (e. g., pectin, agarose). In embodiments, the pool of cells is a pool of plant cells located in a plant, plant part, or plant tissue, and the cells are optionally isolated from the plant, plant part, or plant tissue in a step following the integration of a polynucleotide at a DSB.

In embodiments, the polynucleotide donor molecule that is integrated (or has sequence that is integrated) at the DSB is double-stranded and blunt-ended; in other embodiments the polynucleotide donor molecule is double-stranded and has an overhang or “sticky end” consisting of unpaired nucleotides (e. g., 1, 2, 3, 4, 5, or 6 unpaired nucleotides) at one terminus or both termini. In an embodiment, the polynucleotide donor molecule that is integrated (or has sequence that is integrated) at the DSB is a double-stranded DNA or double-stranded DNA/RNA hybrid molecule of about 18 to about 300 base-pairs, or about 20 to about 200 base-pairs, or about 30 to about 100 base-pairs, and having at least one phosphorothioate bond between adjacent nucleotides at a 5′ end, 3′ end, or both 5′ and 3′ ends. In embodiments, the polynucleotide donor molecule includes single strands of at least 11, at least 18, at least 20, at least 30, at least 40, at least 60, at least 80, at least 100, at least 120, at least 140, at least 160, at least 180, at least 200, at least 240, at least 280, or at least 320 nucleotides.
In embodiments, the polynucleotide donor molecule has a length of at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, or at least 11 base-pairs if double-stranded (or nucleotides if single-stranded), or between about 2 to about 320 base-pairs if double-stranded (or nucleotides if single-stranded), or between about 2 to about 500 base-pairs if double-stranded (or nucleotides if single-stranded), or between about 5 to about 500 base-pairs if double-stranded (or nucleotides if single-stranded), or between about 5 to about 300 base-pairs if double-stranded (or nucleotides if single-stranded), or between about 11 to about 300 base-pairs if double-stranded (or nucleotides if single-stranded), or about 18 to about 300 base-pairs if double-stranded (or nucleotides if single-stranded), or between about 30 to about 100 base-pairs if double-stranded (or nucleotides if single-stranded). In embodiments, the polynucleotide donor molecule includes chemically modified nucleotides; in embodiments, the naturally occurring phosphodiester backbone of the polynucleotide donor molecule is partially or completely modified with phosphorothioate, phosphorodithioate, or methylphosphonate internucleotide linkage modifications, or the polynucleotide donor molecule includes modified nucleoside bases or modified sugars, or the polynucleotide donor molecule is labelled with a fluorescent moiety or other detectable label. In an embodiment, the polynucleotide donor molecule is double-stranded and is perfectly base-paired through all or most of its length, with the possible exception of any unpaired nucleotides at either terminus or both termini. In another embodiment, the polynucleotide donor molecule is double-stranded and includes one or more non-terminal mismatches or non-terminal unpaired nucleotides within the otherwise double-stranded duplex.
In related embodiments, the polynucleotide donor molecule that is integrated at the DSB is a single-stranded DNA or a single-stranded DNA/RNA hybrid. Additional description of the polynucleotide donor molecule is found above in the paragraphs following the heading “Polynucleotide Molecules”.

In embodiments, the polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule that is integrated at the DSB includes a nucleotide sequence that, if integrated (or has sequence that is integrated) at the DSB, forms a heterologous insertion that is not normally found in the genome. In embodiments, sequence encoded by the polynucleotide molecule that is integrated at the DSB includes a nucleotide sequence that does not normally occur in the genome containing the DSB; this can be established by sequencing of the genome, or by hybridization experiments. In certain embodiments, sequence encoded by the polynucleotide molecule, when integrated at the DSB, not only permits identification of the locus of the DSB, but also imparts a functional trait to the cell including the genomic DNA, or to an organism including the cell; in non-limiting examples, sequence encoded by the polynucleotide molecule that is integrated at the DSB includes at least one of the nucleotide sequences selected from the group consisting of:

(a) DNA encoding at least one stop codon, or at least one stop codon on each strand, or at least one stop codon within each reading frame on each strand;
(b) DNA encoding heterologous primer sequence (e. g., a sequence of about 18 to about 22 contiguous nucleotides, or of at least 18, at least 20, or at least 22 contiguous nucleotides that can be used to initiate DNA polymerase activity at the site of the DSB);
(c) DNA encoding a unique identifier sequence (e. g., a sequence that when inserted at the DSB creates a heterologous sequence that can be used to identify the presence of the insertion);
(d) DNA encoding a transcript-stabilizing sequence;
(e) DNA encoding a transcript-destabilizing sequence;
(f) a DNA aptamer or DNA encoding an RNA aptamer or amino acid aptamer; and
(g) DNA that includes or encodes a sequence recognizable by a specific binding agent.

Methods of Identifying the Nucleotide Sequence of a Locus in the Genome that is Associated with a Phenotype

In another aspect, the invention provides a method of identifying the nucleotide sequence of a locus in the genome that is associated with a phenotype, the method including the steps of:

(a) providing to a population of cells (such as plant cells or plant protoplasts) having the genome:
    (i) multiple different guide RNAs (gRNAs) to induce multiple different double strand breaks (DSBs) in the genome, wherein each DSB is produced by an RNA-guided nuclease guided to a locus on the genome by one of the gRNAs, and
    (ii) polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecules having a defined nucleotide sequence, wherein the polynucleotide molecules are capable of being integrated (or have sequence that is integrated) into the DSBs by non-homologous end-joining (NHEJ);
whereby when sequence encoded by at least some of the polynucleotide molecules is inserted into at least some of the DSBs, a genetically heterogeneous population of cells is produced;
(b) selecting from the genetically heterogeneous population of cells a subset of cells that exhibit a phenotype of interest;
(c) using a pool of PCR primers that bind to sequence encoded by the polynucleotide molecules to amplify from the subset of cells DNA from the locus of a DSB into which sequence encoded by one of the polynucleotide molecules has been inserted; and
(d) sequencing the amplified DNA to identify the locus associated with the phenotype of interest.
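The analysis implied by steps (c) and (d) can be sketched in a few lines. The following is only an illustration, not part of the disclosed method: it assumes the amplified DNA has been sequenced into reads held as plain strings and that exact string matching is adequate. The function names, the `flank_len` parameter, and any donor sequence passed in are hypothetical.

```python
from collections import Counter

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def flanks_of_insert(read, donor, flank_len=20):
    """Locate the donor-encoded sequence in a read, in either orientation.

    Returns (left_flank, right_flank) as they appear in the read,
    or None when the donor sequence is absent.
    """
    for oriented in (donor, reverse_complement(donor)):
        pos = read.find(oriented)
        if pos != -1:
            end = pos + len(oriented)
            return read[max(0, pos - flank_len):pos], read[end:end + flank_len]
    return None


def tally_insertion_sites(reads, donor, flank_len=20):
    """Count reads per distinct pair of flanking sequences.

    Each distinct flank pair is a candidate DSB locus; in practice the
    flanks would then be aligned to the reference genome (not shown).
    """
    donor = donor.upper()
    sites = Counter()
    for read in reads:
        hit = flanks_of_insert(read.upper(), donor, flank_len)
        if hit is not None:
            sites[hit] += 1
    return sites
```

Flanks are reported as they occur in each read, so reads taken from opposite strands of the same locus yield different keys. A real pipeline would resolve this, along with sequencing errors and partial insertions, by aligning the flanks to the genome.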

In embodiments, the cells are plant cells or plant protoplasts or algal cells. In embodiments, the genetically heterogeneous population of cells undergoes one or more doubling cycles; for example, the population of cells is provided with growth conditions that should normally result in cell division, and at least some of the cells undergo one or more doublings. In embodiments, the genetically heterogeneous population of cells is subjected to conditions permitting expression of the phenotype of interest. In embodiments, the cells are provided in a single pool or population (e. g., in a single container); in other embodiments, the cells are provided in an arrayed format (e. g., in microwell plates or in droplets in a microfluidics device or attached individually to particles or beads).

In embodiments, the RNA-guided nuclease or a polynucleotide that encodes the RNA-guided nuclease is exogenously provided to the population of cells. In embodiments, each gRNA is provided as a polynucleotide composition including: (a) a CRISPR RNA (crRNA) that includes the gRNA, or a polynucleotide that encodes a crRNA, or a polynucleotide that is processed into a crRNA; or (b) a single guide RNA (sgRNA) that includes the gRNA, or a polynucleotide that encodes an sgRNA, or a polynucleotide that is processed into an sgRNA. In embodiments, the multiple guide RNAs are provided as ribonucleoproteins (e. g., Cas9 nuclease molecules complexed with different gRNAs to form different RNPs). In embodiments, each gRNA is provided as a ribonucleoprotein (RNP) including the RNA-guided nuclease and an sgRNA. In embodiments, multiple guide RNAs are provided, as well as a single polynucleotide donor molecule having a sequence to be integrated at the resulting DSBs; in other embodiments, multiple guide RNAs are provided, as well as different polynucleotide donor molecules having a sequence to be integrated at the resulting multiple DSBs.

In another embodiment, a detection method is provided for identifying a plant as having been subjected to genomic modification according to a targeted modification method described herein, where that modification method yields a low frequency of off-target mutations. The detection method comprises a step of identifying the off-target mutations (e. g., an insertion of a non-specific sequence, a deletion, or an indel resulting from the use of the targeting agents, or insertions of part or all of a sequence encoded by one or more polynucleotide donor molecules at one or more coding or non-coding loci in a genome). In a related embodiment, the detection method is used to track movement of a plant cell or plant or product thereof through a supply chain. The presence of such an identified mutation in a processed product or commodity product is de facto evidence that the product contains or is derived from a plant cell, plant, or seed of this invention. In related embodiments, the presence of the off-target mutations is identified using PCR, a chip-based assay, probes specific for the donor sequences, or any other technique known in the art to be useful for detecting the presence of particular nucleic acid sequences.

EXAMPLES

Example 1

This example illustrates techniques for preparing a plant cell or plant protoplast useful in compositions and methods of the invention, for example, in providing a reaction mixture including a plant cell having a double-strand break (DSB) at at least one locus in its genome. More specifically, this non-limiting example describes techniques for preparing isolated, viable plant protoplasts from monocot and dicot plants.

The following mesophyll protoplast preparation protocol (modified from one publicly available at molbio[dot]mgh[dot]harvard.edu/sheenweb/protocols_reg[dot]html) is generally suitable for use with monocot plants such as maize (Zea mays) and rice (Oryza sativa):

Prepare an enzyme solution containing 0.6 molar mannitol, 10 millimolar MES pH 5.7, 1.5% cellulase R10, and 0.3% macerozyme R10. Heat the enzyme solution at 50-55 degrees Celsius for 10 minutes to inactivate proteases and accelerate dissolution of the enzymes, then cool it to room temperature before adding 1 millimolar CaCl₂, 5 millimolar β-mercaptoethanol, and 0.1% bovine serum albumin. Pass the enzyme solution through a 0.45 micrometer filter. Prepare a washing solution containing 0.6 molar mannitol, 4 millimolar MES pH 5.7, and 20 millimolar KCl.

Obtain second leaves of the monocot plant (e. g., maize or rice) and cut out the middle 6-8 centimeters. Stack ten leaf sections and cut into 0.5 millimeter-wide strips without bruising the leaves. Submerge the leaf strips completely in the enzyme solution in a petri dish, cover with aluminum foil, and apply vacuum for 30 minutes to infiltrate the leaf tissue. Transfer the dish to a platform shaker and incubate for an additional 2.5 hours' digestion with gentle shaking (40 rpm). After digestion, carefully transfer the enzyme solution (now containing protoplasts) using a serological pipette through a 35 micrometer nylon mesh into a round-bottom tube; rinse the petri dish with 5 milliliters of washing solution and filter this through the mesh as well. Centrifuge the protoplast suspension at 1200 rpm for 2 minutes in a swing-bucket centrifuge. Aspirate off as much of the supernatant as possible without touching the pellet; gently wash the pellet once with 20 milliliters washing buffer and remove the supernatant carefully. Gently resuspend the pellet by swirling in a small volume of washing solution, then resuspend in 10-20 milliliters of washing buffer. Place the tube upright on ice for 30 minutes to 4 hours (no longer). After resting on ice, remove the supernatant by aspiration and resuspend the pellet with 2-5 milliliters of washing buffer. Measure the concentration of protoplasts using a hemocytometer and adjust the concentration to 2×10^5 protoplasts/milliliter with washing buffer.

The following mesophyll protoplast preparation protocol (modified from one described by Niu and Sheen (2012) Methods Mol. Biol., 876:195-206, doi: 10.1007/978-1-61779-809-2_16) is generally suitable for use with dicot plants such as Arabidopsis thaliana and brassicas such as kale (Brassica oleracea).

Prepare an enzyme solution containing 0.4 molar mannitol, 20 millimolar KCl, 20 millimolar MES pH 5.7, 1.5% cellulase R10, and 0.4% macerozyme R10. Heat the enzyme solution at 50-55 degrees Celsius for 10 minutes to inactivate proteases and accelerate dissolution of the enzymes, and then cool it to room temperature before adding 10 millimolar CaCl₂, 5 millimolar β-mercaptoethanol, and 0.1% bovine serum albumin. Pass the enzyme solution through a 0.45 micrometer filter. Prepare a “W5” solution containing 154 millimolar NaCl, 125 millimolar CaCl₂, 5 millimolar KCl, and 2 millimolar MES pH 5.7. Prepare an “MMg” solution containing 0.4 molar mannitol, 15 millimolar MgCl₂, and 4 millimolar MES pH 5.7.

Obtain the second or third pair of true leaves of the dicot plant (e. g., a brassica such as kale) and cut out the middle section. Stack 4-8 leaf sections and cut into 0.5 millimeter-wide strips without bruising the leaves. Submerge the leaf strips completely in the enzyme solution in a petri dish, cover with aluminum foil, and apply vacuum for 30 minutes to infiltrate the leaf tissue. Transfer the dish to a platform shaker and incubate for an additional 2.5 hours' digestion with gentle shaking (40 rpm). After digestion, carefully transfer the enzyme solution (now containing protoplasts) using a serological pipette through a 35 micrometer nylon mesh into a round-bottom tube; rinse the petri dish with 5 milliliters of washing solution and filter this through the mesh as well. Centrifuge the protoplast suspension at 1200 rpm for 2 minutes in a swing-bucket centrifuge. Aspirate off as much of the supernatant as possible without touching the pellet; gently wash the pellet once with 20 milliliters washing buffer and remove the supernatant carefully. Gently resuspend the pellet by swirling in a small volume of washing solution, then resuspend in 10-20 milliliters of washing buffer. Place the tube upright on ice for 30 minutes to 4 hours (no longer). After resting on ice, remove the supernatant by aspiration and resuspend the pellet with 2-5 milliliters of MMg solution. Measure the concentration of protoplasts using a hemocytometer and adjust the concentration to 2×10^5 protoplasts/milliliter with MMg solution.
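The final counting and adjustment step of both protocols above reduces to simple arithmetic, sketched below. This is an illustrative calculation only. It assumes a standard counting chamber in which each large square measures 1 mm × 1 mm × 0.1 mm (10⁻⁴ milliliter), which the protocols do not specify; the function names are hypothetical.

```python
def hemocytometer_concentration(counts, dilution_factor=1.0):
    """Protoplasts per milliliter from counts of large chamber squares.

    Assumes each counted square holds 1e-4 mL (1 mm x 1 mm x 0.1 mm),
    so cells/mL = mean count per square x 10^4 x dilution factor.
    """
    if not counts:
        raise ValueError("at least one square must be counted")
    mean_count = sum(counts) / len(counts)
    return mean_count * 1e4 * dilution_factor


def buffer_to_add_ml(current_per_ml, current_volume_ml, target_per_ml=2e5):
    """Volume of buffer (mL) to add to dilute a suspension to the target.

    Uses C1 * V1 = C2 * V2; a suspension already below the target
    cannot be fixed by dilution and must be pelleted and resuspended.
    """
    if current_per_ml < target_per_ml:
        raise ValueError("suspension is below target; concentrate it instead")
    final_volume_ml = current_per_ml * current_volume_ml / target_per_ml
    return final_volume_ml - current_volume_ml
```

For instance, a mean of 50 protoplasts per large square corresponds to 5×10^5 protoplasts/milliliter under this assumption, and 2 milliliters of such a suspension would need 3 milliliters of added buffer to reach 2×10^5 protoplasts/milliliter.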

Example 2

This example illustrates culture conditions effective in improving viability of plant cells or plant protoplasts. More specifically, this non-limiting example describes media and culture conditions for improving viability of isolated plant protoplasts.

Table 1 provides the compositions of different liquid basal media suitable for culturing plant cells or plant protoplasts; final pH of all media was adjusted to 5.8 if necessary.

TABLE 1
Concentration (mg/L unless otherwise noted)
Component   SH   8p   PIM   P2   YPIM B−
Casamino acids   250
Coconut water   20000
Ascorbic acid   2
biotin   0.01   0.01
Cholecalciferol (Vitamin D-3)   0.01
choline chloride   1
Citric acid   40
Cyanocobalamin (Vitamin B-12)   0.02
D-calcium pantothenate   1   1
D-Cellobiose   250
D-Fructose   250
D-Mannose   250
D-Ribose   250
D-Sorbitol   250
D-Xylose   250
folic acid   0.4   0.2
Fumaric acid   40
L-Malic acid   40
L-Rhamnose   250
p-Aminobenzoic acid   0.02
Retinol (Vitamin A)   0.01
Riboflavin   0.2
Sodium pyruvate   20
2,4-D   0.5   0.2   1   5   1
6-benzylaminopurine (BAP)   1
Indole-3-butyric acid (IBA)   2.5
Kinetin   0.1
Naphthaleneacetic acid (NAA)   1
parachlorophenoxyacetate (pCPA)   2
Thidiazuron   0.022
Zeatin   0.5
AlCl₃   0.03
Bromocresol purple   8
CaCl₂•2H₂O   200   600   440   200   440
CoCl₂•6H₂O   0.1   0.025   0.1
CuSO₄•5H₂O   0.2   0.025   0.03   0.2   0.03
D-Glucose   68400   40000   40000
D-Mannitol   52000   250   60000   52000   60000
FeSO₄•7H₂O   15   27.8   15   15   15
H₃BO₃   5   3   1   5   1
KCl   300
KH₂PO₄   170   170   170
KI   1   0.75   0.01   1   0.01
KNO₃   2500   1900   505   2500   505
MES pH 5.8 (mM)   3.586   25   25
MgSO₄•7H₂O   400   300   370   400   370
MnSO₄•H₂O   10   10   0.1   10   0.1
Na₂EDTA   20   37.3   20   20   20
Na₂MoO₄•2H₂O   0.1   0.25   0.1
NH₄H₂PO₄   300   300
NH₄NO₃   600   160   160
NiCl₂•6H₂O   0.03
Sucrose   30000   2500   30000
ZnSO₄•7H₂O   1   2   1   1   1
Tween-80 (microliter/L)   10   10
Inositol   1000   100   100   1000   100
Nicotinamide   1
Nicotinic acid   5   1   5   1
Pyridoxine•HCl   0.5   1   1   0.5   1
Thiamine•HCl   5   1   1   5   1
* Sources for basal media: SH: Schenk and Hildebrandt, Can. J. Bot. 50: 199 (1971). 8p: Kao and Michayluk, Planta 126: 105 (1975). P2: SH but with hormones from Potrykus et al., Mol. Gen. Genet. 156: 347 (1977). PIM: Chupeau et al., The Plant Cell 25: 2444 (2013).

Example 3

This example illustrates culture conditions effective in improving viability of plant cells or plant protoplasts. More specifically, this non-limiting example describes methods for encapsulating isolated plant protoplasts.

When protoplasts are encapsulated in alginate or pectin, they remain intact far longer than they would in an equivalent liquid medium. In order to encapsulate protoplasts, a liquid medium (“calcium base”) is prepared that is in all other respects identical to the final desired recipe with the exception that the calcium (usually CaCl₂·2H₂O) is increased to 80 millimolar. A second medium (“encapsulation base”) is prepared that has no added calcium but contains 10 g/L of the encapsulation agent, e. g., by making a 20 g/L solution of the encapsulation agent and adjusting its pH with KOH or NaOH until it is about 5.8, making a 2× solution of the final medium (with no calcium), then combining these two solutions in a 1:1 ratio. Encapsulation agents include alginate (e. g., alginic acid from brown algae, catalogue number A0682, Sigma-Aldrich, St. Louis, Mo.) and pectin (e. g., pectin from citrus peel, catalogue number P9136, Sigma-Aldrich, St. Louis, Mo.; various pectins including non-amidated low-methoxyl pectin, catalogue number 1120-50 from Modernist Pantry, Portsmouth, N.H.). The solutions, including the encapsulation base solution, are filter-sterilized through a series of filters, with the final filter being a 0.2-micrometer filter. Protoplasts are pelleted by gentle centrifugation and resuspended in the encapsulation base; the resulting suspension is added dropwise to the calcium base, upon which the protoplasts are immediately encapsulated in solid beads.
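The quantities in the recipe above follow from two small calculations, sketched here as a worked example rather than as a prescribed procedure. The molar mass of CaCl₂·2H₂O (147.01 g/mol) is a standard value and not taken from this disclosure; the function names and batch volumes are hypothetical.

```python
CACL2_DIHYDRATE_G_PER_MOL = 147.01  # standard molar mass, assumed here


def encapsulation_base_plan(final_volume_ml, agent_stock_g_per_l=20.0):
    """Volumes for combining agent stock and 2x medium in a 1:1 ratio.

    Mixing equal volumes halves both concentrations, so a 20 g/L agent
    stock gives 10 g/L agent and the 2x medium becomes 1x.
    """
    half = final_volume_ml / 2.0
    return {
        "agent_stock_ml": half,
        "medium_2x_ml": half,
        "agent_final_g_per_l": agent_stock_g_per_l / 2.0,
        "medium_final_strength_x": 1.0,
    }


def calcium_salt_grams(volume_ml, millimolar=80.0,
                       g_per_mol=CACL2_DIHYDRATE_G_PER_MOL):
    """Grams of calcium salt needed for the calcium base."""
    moles = (millimolar / 1000.0) * (volume_ml / 1000.0)
    return moles * g_per_mol
```

Under these assumptions, one liter of calcium base at 80 millimolar requires about 11.8 grams of CaCl₂·2H₂O, and 100 milliliters of encapsulation base is made from 50 milliliters of each stock.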

Example 4

This example illustrates culture conditions effective in improving viability of plant cells or plant protoplasts. More specifically, this non-limiting example describes observations of effects on protoplast viability obtained by adding non-conventionally high levels of divalent cations to culture media.

Typical plant cell or plant protoplast media contain between about 2 to about 4 millimolar calcium cations and between about 1 to about 1.5 millimolar magnesium cations. In the course of experiments varying and adding components to media, it was discovered that the addition of non-conventionally high levels of divalent cations had a surprisingly beneficial effect on plant cell or plant protoplast viability. Beneficial effects on plant protoplast viability begin to be seen when the culture medium contains about 30 millimolar calcium cations (e. g., as calcium chloride) or about 30 millimolar magnesium cations (e. g., as magnesium chloride). Even higher levels of plant protoplast viability were observed with increasing concentrations of calcium or magnesium cations, i. e., at about 40 millimolar or about 50 millimolar calcium or magnesium cations. The results of several titration experiments indicated that the greatest improvement in protoplast viability was seen using media containing between about 50 to about 100 millimolar calcium cations or about 50 to about 100 millimolar magnesium cations; no negative effects on protoplast viability or physical appearance were observed at these high cation levels. This was observed in multiple experiments using protoplasts obtained from several plant species including maize (multiple germplasms, e. g., B73, A188, B104, HiIIA, HiIIB, BMS), rice, wheat, soy, kale, and strawberry; improved protoplast viability was observed in both encapsulated protoplasts and non-encapsulated protoplasts. Addition of potassium chloride at the same levels had no effect on protoplast viability. It is possible that inclusion of slightly lower (but still non-conventionally high) levels of divalent cations (e. g., about 10 millimolar, about 15 millimolar, about 20 millimolar, or about 25 millimolar calcium cations or magnesium cations) in media is beneficial for plant cells or plant protoplasts of additional plant species.
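To relate the millimolar levels discussed above to the mg/L entries of Table 1, the salt concentrations can be converted as in the short sketch below. It is offered as a convenience calculation only; the molar masses are standard values for the hydrated salts and are not taken from this disclosure, and the function names are hypothetical.

```python
# Standard molar masses (g/mol) of the hydrated salts listed in Table 1.
G_PER_MOL = {
    "CaCl2.2H2O": 147.01,
    "MgSO4.7H2O": 246.47,
}


def mg_per_l_to_millimolar(mg_per_l, g_per_mol):
    """Convert a mass concentration to millimolar (mg/L / g/mol = mmol/L)."""
    return mg_per_l / g_per_mol


def fold_increase(target_millimolar, baseline_mg_per_l, g_per_mol):
    """How many times higher a target level is than a Table 1 entry."""
    return target_millimolar / mg_per_l_to_millimolar(baseline_mg_per_l, g_per_mol)
```

With these values, 440 mg/L CaCl₂•2H₂O is about 3.0 millimolar calcium and 370 mg/L MgSO₄•7H₂O is about 1.5 millimolar magnesium, so a medium at 50 millimolar calcium carries roughly 17 times the calcium of the 440 mg/L entry.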

Example 5

This example illustrates culture conditions effective in improving viability of plant cells or plant protoplasts. More specifically, this non-limiting example describes observations of effects on maize, soybean, and strawberry protoplast viability obtained by adding non-conventionally high levels of divalent cations to culture media.

Separate suspensions of maize B73, winter wheat, soy, and strawberry protoplasts (2×10^5 cells per milliliter) were prepared in YPIM B− liquid medium containing calcium chloride at 0, 50, or 100 millimolar. One-half milliliter aliquots of the suspensions were dispensed into a 24-well microtiter plate.

Viability at day 8 of culture was judged by visualization under a light microscope. At this point, the viability of the maize protoplasts in the 0, 50, and 100 millimolar calcium conditions was 10%, 30%, and 80%, respectively. There were no large differences observed at this time point for protoplasts of the other species.

Viability at day 13 was judged by Evans blue staining and visualization under a light microscope. At this point, the viability of the maize protoplasts in the 0, 50, and 100 millimolar calcium conditions was 0%, 0%, and 10%, respectively; viability of the soybean protoplasts in the 0, 50, and 100 millimolar calcium conditions was 0%, 50%, and 50%, respectively; and viability of the maize protoplasts in the 0 and 50 millimolar calcium conditions was 0% and 50%, respectively (viability was not measured for the 100 millimolar condition). These results demonstrate that culture conditions including calcium cations at 50 or 100 millimolar improved viability of both monocot and dicot protoplasts over a culture time of about 13 days.

Example 6

This example illustrates a method of delivery of an effector molecule to a plant cell or plant protoplast to effect a genetic change, in this case introduction of a double-strand break in the genome. More specifically, this non-limiting example describes a method of delivering a guide RNA (gRNA) in the form of a ribonucleoprotein (RNP) to isolated plant protoplasts.

The following delivery protocol (modified from one publicly available at molbio[dot]mgh[dot]harvard.edu/sheenweb/protocols_reg[dot]html) is generally suitable for use with monocot plants such as maize (Zea mays) and rice (Oryza sativa):

Prepare a polyethylene glycol (PEG) solution containing 40% PEG 4000, 0.2 molar mannitol, and 0.1 molar CaCl₂. Prepare an incubation solution containing 170 milligram/liter KH₂PO₄, 440 milligram/liter CaCl₂·2H₂O, 505 milligram/liter KNO₃, 160 milligram/liter NH₄NO₃, 370 milligram/liter MgSO₄·7H₂O, 0.01 milligram/liter KI, 1 milligram/liter H₃BO₃, 0.1 milligram/liter MnSO₄·4H₂O, 1 milligram/liter ZnSO₄·7H₂O, 0.03 milligram/liter CuSO₄·5H₂O, 1 milligram/liter nicotinic acid, 1 milligram/liter thiamine HCl, 1 milligram/liter pyridoxine HCl, 0.2 milligram/liter folic acid, 0.01 milligram/liter biotin, 1 milligram/liter D-Ca-pantothenate, 100 milligram/liter myo-inositol, 40 grams/liter glucose, 60 grams/liter mannitol, 700 milligram/liter MES, 10 microliter/liter Tween 80, 1 milligram/liter 2,4-D, and 1 milligram/liter 6-benzylaminopurine (BAP); adjust pH to 5.6.

Prepare a crRNA:tracrRNA or guide RNA (gRNA) complex by mixing equal amounts of CRISPR crRNA and tracrRNA (obtainable, e. g., as custom-synthesized Alt-R™ CRISPR crRNA and tracrRNA oligonucleotides from Integrated DNA Technologies, Coralville, Iowa): mix 6 microliters of 100 micromolar crRNA and 6 microliters of 100 micromolar tracrRNA, heat at 95 degrees Celsius for 5 minutes, and then cool the crRNA:tracrRNA complex to room temperature. To the cooled gRNA solution, add 10 micrograms Cas9 nuclease (Aldevron, Fargo, N. Dak.) and incubate 5 minutes at room temperature to allow the ribonucleoprotein (RNP) complex to form. Add the RNP solution to 100 microliters of monocot protoplasts (prepared as described in Example 1) in a microfuge tube; add 5 micrograms salmon sperm DNA (VWR Cat. No.: 95037-160) and an equal volume of the PEG solution. Mix gently by tapping. After 5 minutes, dilute with 880 microliters of washing buffer and mix gently by inverting the tube. Centrifuge 1 minute at 1200 rpm and then remove the supernatant. Resuspend the protoplasts in 1 milliliter incubation solution and transfer to a multi-well plate. The efficiency of genome editing is assessed by any suitable method such as heteroduplex cleavage assay or by sequencing, as described elsewhere in this disclosure.

The following delivery protocol (modified from one described by Niu and Sheen (2012) Methods Mol. Biol., 876:195-206, doi: 10.1007/978-1-61779-809-2_16) is generally suitable for use with dicot plants such as Arabidopsis thaliana and brassicas such as kale (Brassica oleracea):

Prepare a polyethylene glycol (PEG) solution containing 40% PEG 4000, 0.2 molar mannitol, and 0.1 molar CaCl₂. Prepare an incubation solution containing 170 milligram/liter KH₂PO₄, 440 milligram/liter CaCl₂·2H₂O, 505 milligram/liter KNO₃, 160 milligram/liter NH₄NO₃, 370 milligram/liter MgSO₄·7H₂O, 0.01 milligram/liter KI, 1 milligram/liter H₃BO₃, 0.1 milligram/liter MnSO₄·4H₂O, 1 milligram/liter ZnSO₄·7H₂O, 0.03 milligram/liter CuSO₄·5H₂O, 1 milligram/liter nicotinic acid, 1 milligram/liter thiamine HCl, 1 milligram/liter pyridoxine HCl, 0.2 milligram/liter folic acid, 0.01 milligram/liter biotin, 1 milligram/liter D-Ca-pantothenate, 100 milligram/liter myo-inositol, 40 grams/liter glucose, 60 grams/liter mannitol, 700 milligram/liter MES, 10 microliter/liter Tween 80, 1 milligram/liter 2,4-D, and 1 milligram/liter 6-benzylaminopurine (BAP); adjust pH to 5.6.

Prepare a crRNA:tracrRNA or guide RNA (gRNA) complex by mixing equal amounts of CRISPR crRNA and tracrRNA (obtainable, e. g., as custom-synthesized Alt-R™ CRISPR crRNA and tracrRNA oligonucleotides from Integrated DNA Technologies, Coralville, Iowa): mix 6 microliters of 100 micromolar crRNA and 6 microliters of 100 micromolar tracrRNA, heat at 95 degrees Celsius for 5 minutes, and then cool the crRNA:tracrRNA complex to room temperature. To the cooled gRNA solution, add 10 micrograms Cas9 nuclease (Aldevron, Fargo, N. Dak.) and incubate 5 minutes at room temperature to allow the ribonucleoprotein (RNP) complex to form. Add the RNP solution to 100 microliters of dicot protoplasts (prepared as described in Example 1) in a microfuge tube; add 5 micrograms salmon sperm DNA (VWR Cat. No.: 95037-160) and an equal volume of the PEG solution. Mix gently by tapping. After 5 minutes, dilute with 880 microliters of washing buffer and mix gently by inverting the tube. Centrifuge 1 minute at 1200 rpm and then remove the supernatant. Resuspend the protoplasts in 1 milliliter incubation solution and transfer to a multi-well plate. The efficiency of genome editing is assessed by any suitable method such as heteroduplex cleavage assay or by sequencing, as described elsewhere in this disclosure.
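The reagent quantities in the delivery protocols above can be put on a common molar footing with a few unit conversions, sketched below as a back-of-envelope check. The Cas9 molar mass of about 160,000 g/mol is an assumption (an approximate figure for Streptococcus pyogenes Cas9, not stated in this disclosure), as is the protoplast density of 2×10^5 per milliliter carried over from Example 1; the function and constant names are hypothetical.

```python
ASSUMED_CAS9_G_PER_MOL = 160_000.0  # approximate; an assumption, not from the text


def picomoles_from_volume(microliters, micromolar):
    """Microliters x micromolar gives picomoles directly (1e-6 L x 1e-6 mol/L)."""
    return microliters * micromolar


def picomoles_from_mass(micrograms, g_per_mol):
    """Micrograms / (g/mol) gives micromoles; multiply by 1e6 for picomoles."""
    return micrograms / g_per_mol * 1e6


# Guide RNA: 6 microliters of 100 micromolar crRNA annealed to an equal amount of tracrRNA.
guide_pmol = picomoles_from_volume(6, 100)

# Nuclease: 10 micrograms of Cas9 at the assumed molar mass.
cas9_pmol = picomoles_from_mass(10, ASSUMED_CAS9_G_PER_MOL)

# Molar excess of guide RNA over nuclease in the RNP mix.
guide_to_cas9_ratio = guide_pmol / cas9_pmol

# Adding an equal volume of 40% PEG halves its concentration.
peg_final_percent = 40 * 0.5

# Protoplasts in a 100 microliter aliquot at 2e5 per milliliter.
protoplasts_per_reaction = 0.1 * 2e5
```

On these assumptions each reaction contains about 600 picomoles of guide RNA and about 62.5 picomoles of Cas9, a roughly 10-fold molar excess of guide, delivered to about 20,000 protoplasts at a final PEG concentration of about 20%.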

The above protocols for delivery of gRNAs as RNPs to plant protoplasts are adapted for delivery of guide RNAs alone to monocot or dicot protoplasts that express Cas9 nuclease by transient or stable transformation; in this case, the guide RNA complex is prepared as before and added to the protoplasts, but no Cas9 nuclease and no salmon sperm DNA is added. The remainder of the procedures are identical.

Example 7

This example illustrates genome editing in plants and further illustrates a method of delivering gene-editing effector molecules into a plant cell. This example describes introducing at least one double-strand break (DSB) in a genome in a plant cell or plant protoplast, by delivering at least one effector molecule to the plant cell or plant protoplast using at least one physical agent, such as a particulate, microparticulate, or nanoparticulate. More specifically, this non-limiting example illustrates introducing at least one double-strand break (DSB) in a genome in a plant cell or plant protoplast by contacting the plant cell or plant protoplast with a composition including at least one sequence-specific nuclease and at least one physical agent, such as at least one nanocarrier. Embodiments include those wherein the nanocarrier comprises metals (e. g., gold, silver, tungsten, iron, cerium), ceramics (e. g., aluminum oxide, silicon carbide, silicon nitride, tungsten carbide), polymers (e. g., polystyrene, polydiacetylene, and poly(3,4-ethylenedioxythiophene) hydrate), semiconductors (e. g., quantum dots), silicon (e. g., silicon carbide), carbon (e. g., graphite, graphene, graphene oxide, or carbon nanosheets, nanocomplexes, or nanotubes), composites (e. g., polyvinylcarbazole/graphene, polystyrene/graphene, platinum/graphene, palladium/graphene nanocomposites), a polynucleotide, a poly(AT), a polysaccharide (e. g., dextran, chitosan, pectin, hyaluronic acid, and hydroxyethylcellulose), a polypeptide, or a combination of these. In embodiments, such particulates and nanoparticulates are further covalently or non-covalently functionalized, or further include modifiers or cross-linked materials such as polymers (e. g., linear or branched polyethylenimine, poly-lysine), polynucleotides (e. g., DNA or RNA), polysaccharides, lipids, polyglycols (e. g., polyethylene glycol, thiolated polyethylene glycol), polypeptides or proteins, and detectable labels (e. g., a fluorophore, an antigen, an antibody, or a quantum dot).
Embodiments include those wherein the nanocarrier is a nanotube, a carbon nanotube, a multi-walled carbon nanotube, or a single-walled carbon nanotube. Specific nanocarrier embodiments contemplated herein include the single-walled carbon nanotubes, cerium oxide nanoparticles (“nanoceria”), and modifications thereof (e. g., with cationic, anionic, or lipid coatings) described in Giraldo et al. (2014) Nature Materials, 13:400-409; the single-walled carbon nanotubes and heteropolymer complexes thereof described in Zhang et al. (2013) Nature Nanotechnol., 8:959-968 (doi:10.1038/NNANO.2013.236); the single-walled carbon nanotubes and heteropolymer complexes thereof described in Wong et al. (2016) Nano Lett., 16:1161-1172; and the various carbon nanotube preparations described in US Patent Application Publication US 2015/0047074 and International Patent Application PCT/US2015/050885 (published as WO 2016/044698 and claiming priority to U.S. Provisional Patent Application 62/052,767), all of which patent applications are incorporated in their entirety by reference herein. See also, for example, the various types of particles and nanoparticles, their preparation, and methods for their use, e. g., in delivering polynucleotides and polypeptides to cells, disclosed in US Patent Application Publications 2010/0311168, 2012/0023619, 2012/0244569, 2013/0145488, 2013/0185823, 2014/0096284, 2015/0040268, 2015/0047074, and 2015/0208663, all of which are incorporated herein by reference in their entirety.

In these examples, single-walled carbon nanotubes (SWCNT) and modifications thereof are prepared as described in Giraldo et al. (2014) Nature Materials, 13:400-409; Zhang et al. (2013) Nature Nanotechnol., 8:959-968; Wong et al. (2016) Nano Lett., 16:1161-1172; US Patent Application Publication US 2015/0047074; and International Patent Application PCT/US2015/050885 (published as WO 2016/044698). In an initial experiment, a DNA plasmid encoding green fluorescent protein (GFP) as a reporter is non-covalently complexed with a SWCNT preparation and tested on various plant cell preparations including plant cells in suspension culture, plant callus, plant embryos, intact or half seeds, and shoot apical meristem. Delivery to the plant callus, embryos, seeds, and meristem is by treatment with pressure, centrifugation, bombardment, microinjection, infiltration (e. g., with a syringe), or by direct application to the surface of the plant tissue. Efficiency of the SWCNT delivery of GFP across the plant cell wall and the cellular localization of the GFP signal are evaluated by microscopy.

In another experiment, plasmids encoding Cas9 and at least one guide RNA (gRNA), such as those described in Example 6, are non-covalently complexed with a SWCNT preparation and tested on various plant cell preparations including plant cells in suspension culture, plant callus, plant embryos, intact or half seeds, and shoot apical meristem. Delivery to the plant callus, embryos, seeds, and meristem is by treatment with pressure, centrifugation, bombardment, microinjection, infiltration (e. g., with a syringe), or by direct application to the surface of the plant tissue. The gRNA is designed to target the endogenous plant gene phytoene desaturase (PDS) for silencing, where PDS silencing produces a visible phenotype (bleaching, or low/no chlorophyll).

In another experiment, RNA encoding Cas9 and at least one guide RNA (gRNA), such as those described in Example 6, are non-covalently complexed with a SWCNT preparation and tested on various plant cell preparations including plant cells in suspension culture, plant callus, plant embryos, intact or half seeds, and shoot apical meristem. Delivery to the plant callus, embryos, seeds, and meristem is by treatment with pressure, centrifugation, bombardment, microinjection, infiltration (e. g., with a syringe), or by direct application to the surface of the plant tissue. The gRNA is designed to target the endogenous plant gene phytoene desaturase (PDS) for silencing, where PDS silencing produces a visible phenotype (bleaching, or low/no chlorophyll).

In another experiment, a ribonucleoprotein (RNP), prepared by complexation of Cas9 nuclease and at least one guide RNA (gRNA), is non-covalently complexed with a SWCNT preparation and tested on various plant cell preparations including plant cells in suspension culture, plant callus, plant embryos, intact or half seeds, and shoot apical meristem. Delivery to the plant callus, embryos, seeds, and meristem is by treatment with pressure, centrifugation, bombardment, microinjection, infiltration (e. g., with a syringe), or by direct application to the surface of the plant tissue. The gRNA is designed to target the endogenous plant gene phytoene desaturase (PDS) for silencing, where PDS silencing produces a visible phenotype (bleaching, or low/no chlorophyll).
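As a rough illustration of the gRNA design step in the experiments above, the following Python sketch lists candidate 20-nucleotide target sites in a DNA sequence. It assumes a Cas9 from Streptococcus pyogenes, which recognizes an NGG protospacer-adjacent motif (PAM); the experiments above do not name the Cas9 ortholog, and the function names and the toy input sequence are hypothetical and are not taken from a PDS gene or from Example 6.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_protospacers(seq, length=20):
    """List (strand, start, protospacer, pam) for each site followed by NGG.

    Assumption: SpCas9-style targeting (protospacer immediately 5' of NGG).
    For the "-" strand, start is an index into the reverse complement.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all found.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % length)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


# Toy sequence (hypothetical, for illustration only).
toy_sequence = "A" * 20 + "TGG"
candidates = find_protospacers(toy_sequence)
```

Candidate sites found this way would still need to be screened for specificity and activity before use; the sketch only enumerates positions that satisfy the assumed PAM rule.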

One of skill in the art would recognize that the above general compositions and procedures can be modified or combined with other reagents and treatments, such as those described in detail in the paragraphs following the heading “Delivery Methods and Delivery Agents”. In addition, the single-walled carbon nanotubes (SWCNT) and modifications thereof prepared as described in Giraldo et al. (2014) Nature Materials, 13:400-409; Zhang et al. (2013) Nature Nanotechnol., 8:959-968; Wong et al. (2016) Nano Lett., 16:1161-1172; US Patent Application Publication US 2015/0047074; and International Patent Application PCT/US2015/050885 (published as WO 2016/044698) can be used to prepare complexes with other polypeptides or polynucleotides or a combination of polypeptides and polynucleotides (e. g., with one or more polypeptides or ribonucleoproteins including at least one functional domain selected from the group consisting of: transposase domains, integrase domains, recombinase domains, resolvase domains, invertase domains, protease domains, DNA methyltransferase domains, DNA hydroxylmethylase domains, DNA demethylase domains, histone acetylase domains, histone deacetylase domains, nuclease domains, repressor domains, activator domains, nuclear-localization signal domains, transcription-regulatory protein (or transcription complex recruiting) domains, cellular uptake activity associated domains, nucleic acid binding domains, antibody presentation domains, histone modifying enzymes, recruiter of histone modifying enzymes, inhibitor of histone modifying enzymes, histone methyltransferases, histone demethylases, histone kinases, histone phosphatases, histone ribosylases, histone deribosylases, histone ubiquitinases, histone deubiquitinases, histone biotinases, and histone tail proteases).

Example 8

This example illustrates genome editing in plants and further illustrates a method of delivering gene-editing effector molecules into a plant cell. More specifically, this non-limiting example describes introducing at least one double-strand break (DSB) in a genome in a plant cell or plant protoplast, by contacting the plant cell or plant protoplast with a composition including a sequence-specific nuclease complexed with a gold nanoparticle.

In embodiments, at least one double-strand break (DSB) is introduced in a genome in a plant cell or plant protoplast, by contacting the plant cell or plant protoplast with a composition that includes a charge-modified sequence-specific nuclease complexed to a charge-modified gold nanoparticle, wherein the complexation is non-covalent, e. g., through ionic or electrostatic interactions. In an embodiment, a sequence-specific nuclease having at least one region bearing a positive charge forms a complex with a negatively-charged gold particle; in another embodiment, a sequence-specific nuclease having at least one region bearing a negative charge forms a complex with a positively-charged gold particle. Any suitable method can be used for modifying the charge of the nuclease or the nanoparticle, for instance, through covalent modification to add functional groups, or non-covalent modification (e. g., by coating a nanoparticle with a cationic, anionic, or lipid coating). In embodiments, the sequence-specific nuclease is a type II Cas nuclease having at least one modification selected from the group consisting of: (a) modification at the N-terminus with at least one negatively charged moiety; (b) modification at the N-terminus with at least one moiety carrying a carboxylate functional group; (c) modification at the N-terminus with at least one glutamate residue, at least one aspartate residue, or a combination of glutamate and aspartate residues; (d) modification at the C-terminus with a localization signal, transit, or targeting peptide; and (e) modification at the C-terminus with a nuclear localization signal (NLS), a chloroplast transit peptide (CTP), or a mitochondrial targeting peptide (MTP). In embodiments, the type II Cas nuclease is a Cas9 from Streptococcus pyogenes wherein the Cas9 is modified at the N-terminus with at least one negatively charged moiety and modified at the C-terminus with a nuclear localization signal (NLS), a chloroplast transit peptide (CTP), or a mitochondrial targeting peptide (MTP). In embodiments, the type II Cas nuclease is a Cas9 from Streptococcus pyogenes wherein the Cas9 is modified at the N-terminus with a polyglutamate peptide and modified at the C-terminus with a nuclear localization signal (NLS). In embodiments, the gold nanoparticle has at least one modification selected from the group consisting of: (a) modification with positively charged moieties; (b) modification with at least one moiety carrying a positively charged amine; (c) modification with at least one polyamine; and (d) modification with at least one lysine residue, at least one histidine residue, at least one arginine residue, at least one guanidine, or a combination thereof. Specific embodiments include those wherein: (a) the sequence-specific nuclease is a type II Cas nuclease modified at the N-terminus with at least one negatively charged moiety and modified at the C-terminus with a nuclear localization signal (NLS), a chloroplast transit peptide (CTP), or a mitochondrial targeting peptide (MTP); and the gold nanoparticle is modified with at least one positively charged moiety; (b) the type II Cas nuclease is a Cas9 from Streptococcus pyogenes modified at the N-terminus with a polyglutamate peptide and modified at the C-terminus with a nuclear localization signal (NLS); and the gold nanoparticle is modified with at least one lysine residue, at least one histidine residue, at least one arginine residue, at least one guanidine, or a combination thereof; (c) the type II Cas nuclease is a Cas9 from Streptococcus pyogenes modified at the N-terminus with a polyglutamate peptide that includes at least 15 glutamate residues and modified at the C-terminus with a nuclear localization signal (NLS); and wherein the gold nanoparticle is modified with at least one lysine residue, at least one histidine residue, at least one arginine residue, at least one guanidine, or a combination thereof. In a specific embodiment, at least one double-strand break (DSB) is introduced in a genome in a plant cell or plant protoplast, by contacting the plant cell or plant protoplast with a composition including a sequence-specific nuclease complexed with a gold nanoparticle, wherein the sequence-specific nuclease is a Cas9 from Streptococcus pyogenes modified at the N-terminus with a polyglutamate peptide that includes at least 15 glutamate residues and modified at the C-terminus with a nuclear localization signal (NLS); and wherein the gold nanoparticle is in the form of cationic arginine gold nanoparticles (ArgNPs), and wherein when the modified Cas9 and the ArgNPs are mixed, self-assembled nanoassemblies are formed as described in Mout et al. (2017) ACS Nano, doi:10.1021/acsnano.6b07600. Other embodiments contemplated herein include the various nanoparticle-protein complexes (e. g., amine-bearing nanoparticles complexed with carboxylate-bearing proteins) described in International Patent Application PCT/US2016/015711, published as International Patent Application Publication WO2016/123514, which claims priority to U.S. Provisional Patent Applications 62/109,389, 62/132,798, and 62/169,805, all of which patent applications are incorporated in their entirety by reference herein.
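The layout of the charge-modified nuclease in the specific embodiment above (an N-terminal polyglutamate peptide of at least 15 residues and a C-terminal NLS) can be sketched in Python as a simple string check. This is only an illustration under stated assumptions: the SV40 large T-antigen NLS (PKKKRKV) is used as a stand-in because no particular NLS sequence is named here, the core protein is a short placeholder rather than a Cas9 sequence, the charge estimate simply counts acidic and basic residues, and all function names are hypothetical.

```python
# Assumption: SV40 large T-antigen NLS used as a stand-in NLS sequence.
SV40_NLS = "PKKKRKV"

ACIDIC = set("DE")   # counted as -1 each
BASIC = set("KR")    # counted as +1 each


def net_charge(peptide):
    """Crude net charge near neutral pH (histidine and termini are ignored)."""
    return sum(aa in BASIC for aa in peptide) - sum(aa in ACIDIC for aa in peptide)


def leading_glutamates(protein):
    """Number of consecutive glutamate (E) residues at the N-terminus."""
    return len(protein) - len(protein.lstrip("E"))


def build_construct(core, n_glu=15, nls=SV40_NLS):
    """Assemble polyglutamate tag + core protein + C-terminal NLS."""
    return "E" * n_glu + core + nls


def matches_specific_embodiment(protein, nls=SV40_NLS, min_glu=15):
    """True if the protein starts with >= min_glu glutamates and ends with the NLS."""
    return leading_glutamates(protein) >= min_glu and protein.endswith(nls)


# Placeholder core (hypothetical; not a Cas9 sequence).
construct = build_construct("MAAAA")
tag_charge = net_charge("E" * 15)
```

Under this crude count, a 15-residue polyglutamate tag contributes a net negative charge that would favor non-covalent association with a positively charged nanoparticle, which is the design rationale described above.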

In embodiments, the sequence-specific nuclease is an RNA-guided DNA endonuclease, such as a type II Cas nuclease, and the composition further includes at least one guide RNA (gRNA) for an RNA-guided nuclease, or a DNA encoding a gRNA for an RNA-guided nuclease. The method effects the introduction of at least one double-strand break (DSB) in a genome in a plant cell or plant protoplast; in embodiments, the genome is that of the plant cell or plant protoplast; in embodiments, the genome is that of a nucleus, mitochondrion, plastid, or endosymbiont in the plant cell or plant protoplast. In embodiments, the at least one double-strand break (DSB) is introduced into coding sequence, non-coding sequence, or a combination of coding and non-coding sequence. In embodiments, the plant cell or plant protoplast is a plant cell in an intact plant or seedling or plantlet, a plant tissue, seed, embryo, meristem, germline cells, callus, or a suspension of plant cells or plant protoplasts.

In embodiments, at least one dsDNA molecule is also provided to the plant cell or plant protoplast, and is integrated at the site of at least one DSB or at the location where genomic sequence is deleted between two DSBs. Embodiments include those wherein: (a) the at least one DSB is two blunt-ended DSBs, resulting in deletion of genomic sequence between the two blunt-ended DSBs, and wherein the dsDNA molecule is blunt-ended and is integrated into the genome between the two blunt-ended DSBs; (b) the at least one DSB is two DSBs, wherein the first DSB is blunt-ended and the second DSB has an overhang, resulting in deletion of genomic sequence between the two DSBs, and wherein the dsDNA molecule is blunt-ended at one terminus and has an overhang on the other terminus, and is integrated into the genome between the two DSBs; (c) the at least one DSB is two DSBs, each having an overhang, resulting in deletion of genomic sequence between the two DSBs, and wherein the dsDNA molecule has an overhang at each terminus and is integrated into the genome between the two DSBs.
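The blunt-ended case (a) above can be modeled on a single strand as a string operation: the sequence between the two cut positions is removed and the donor sequence is joined in its place. The following Python sketch is a simplification that ignores overhangs (cases (b) and (c)), donor orientation, and repair errors; the function name and the example sequences are hypothetical.

```python
def integrate_between_cuts(genome, cut_a, cut_b, donor):
    """Replace genome[cut_a:cut_b] with donor, modeling two blunt DSBs.

    cut_a and cut_b are 0-based positions between bases; the bases from
    cut_a up to (not including) cut_b are deleted. Setting cut_a == cut_b
    models integration at a single DSB with no deletion.
    """
    if not 0 <= cut_a <= cut_b <= len(genome):
        raise ValueError("require 0 <= cut_a <= cut_b <= len(genome)")
    return genome[:cut_a] + donor + genome[cut_b:]


# Hypothetical toy sequences.
edited = integrate_between_cuts("AAAACCCCGGGG", 4, 8, "TT")
inserted_only = integrate_between_cuts("AAAAGGGG", 4, 4, "TT")
```

Here the four C bases between the cuts are deleted and replaced by the two-base donor, while the second call shows the single-DSB case in which nothing is deleted.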

In a non-limiting example, self-assembled green fluorescent protein (GFP)/cationic arginine gold nanoparticle (ArgNP) nanoassemblies are prepared as described in International Patent Application Publication WO2016/123514. The GFP/ArgNP nanoassemblies are delivered to maize protoplasts and to kale protoplasts prepared as described in Example 1, and to protoplasts prepared from the Black Mexican Sweet (BMS) maize cell line. Efficiency of transfection or delivery is assessed by fluorescence microscopy at time points after transfection (30 minutes, 1 hour, 3 hours, 6 hours, and overnight).

In a non-limiting example, self-assembled GFP/cationic arginine gold nanoparticle (ArgNP) nanoassemblies are prepared as described in International Patent Application Publication WO2016/123514. The GFP/ArgNP nanoassemblies are co-incubated with plant cells in suspension culture. Efficiency of transfection or delivery across the plant cell wall is assessed by fluorescence microscopy at time points after transfection (30 minutes, 1 hour, 3 hours, 6 hours, and overnight).

In a non-limiting example, self-assembled GFP/cationic arginine gold nanoparticle (ArgNP) nanoassemblies are prepared as described in International Patent Application Publication WO2016/123514. The GFP/ArgNP nanoassemblies are further prepared for Biolistics or particle bombardment and thus delivered to plant cells from suspension cultures transferred to semi-solid or solid media, as well as to soybean embryogenic callus. Efficiency of transfection or delivery across the plant cell wall is assessed by fluorescence microscopy at time points after transfection (30 minutes, 1 hour, 3 hours, 6 hours, and overnight).

In a non-limiting example, self-assembled GFP/cationic arginine gold nanoparticle (ArgNP) nanoassemblies are prepared as described in International Patent Application Publication WO2016/123514. The GFP/ArgNP nanoassemblies are delivered by infiltration (e. g., using mild positive pressure or negative pressure) into leaves of Arabidopsis thaliana plants. Efficiency of transfection or delivery across the plant cell wall is assessed by fluorescence microscopy at time points after transfection (30 minutes, 1 hour, 3 hours, 6 hours, and overnight).

In a non-limiting example, self-assembled Cas9/ArgNP nanoassemblies are prepared as described in Mout et al. (2017) ACS Nano, doi:10.1021/acsnano.6b07600 or alternatively as described in International Patent Application Publication WO2016/123514, by mixing a Cas9 from Streptococcus pyogenes modified at the N-terminus with a polyglutamate peptide that includes at least 15 glutamate residues and modified at the C-terminus with a nuclear localization signal (NLS) with cationic arginine gold nanoparticles (ArgNPs). The Cas9/ArgNP nanoassemblies are delivered to maize protoplasts or to kale protoplasts prepared as described in Example 1, and to protoplasts prepared from the Black Mexican Sweet (BMS) maize cell line. In one variation of the procedure, the Cas9/ArgNP nanoassemblies are co-delivered with at least one guide RNA (such as those described in Examples 4, 5, 8, 9, 10, 12, and 13) to the protoplasts. In other variations of the procedure, the self-assembled Cas9/ArgNP nanoassemblies are prepared with at least one guide RNA to allow the modified Cas9 to form a ribonucleoprotein (RNP) either prior to or after formation of the nanoassemblies; the self-assembled RNP/ArgNP nanoassemblies are then delivered to the protoplasts. Efficiency of editing is assessed by any suitable method such as a heteroduplex cleavage assay or by sequencing, as described elsewhere in this disclosure.

In a non-limiting example, self-assembled Cas9/ArgNP nanoassemblies are prepared as described in Mout et al. (2017) ACS Nano, doi:10.1021/acsnano.6b07600 or alternatively as described in International Patent Application Publication WO2016/123514, by mixing a Cas9 from Streptococcus pyogenes modified at the N-terminus with a polyglutamate peptide that includes at least 15 glutamate residues and modified at the C-terminus with a nuclear localization signal (NLS) with cationic arginine gold nanoparticles (ArgNPs). The Cas9/ArgNP nanoassemblies are co-incubated with plant cells in suspension culture. In one variation of the procedure, the Cas9/ArgNP nanoassemblies are co-delivered with at least one guide RNA (such as those described in Examples 4, 5, 8, 9, 10, 12, and 13) to the plant cells in suspension culture. In other variations of the procedure, the self-assembled Cas9/ArgNP nanoassemblies are prepared with at least one guide RNA to allow the modified Cas9 to form a ribonucleoprotein (RNP) either prior to or after formation of the nanoassemblies; the self-assembled RNP/ArgNP nanoassemblies are then delivered to the plant cells in suspension culture. Efficiency of editing is assessed by any suitable method such as a heteroduplex cleavage assay or by sequencing, as described elsewhere in this disclosure.

In a non-limiting example, self-assembled Cas9/ArgNP nanoassemblies are prepared as described in Mout et al. (2017) ACS Nano, doi:10.1021/acsnano.6b07600 or alternatively as described in International Patent Application Publication WO2016/123514, by mixing a Cas9 from Streptococcus pyogenes modified at the N-terminus with a polyglutamate peptide that includes at least 15 glutamate residues and modified at the C-terminus with a nuclear localization signal (NLS) with cationic arginine gold nanoparticles (ArgNPs). The Cas9/ArgNP nanoassemblies are further prepared for Biolistics or particle bombardment and thus delivered to plant cells from suspension cultures transferred to semi-solid or solid media, as well as to soybean embryogenic callus. In one variation of the procedure, the Cas9/ArgNP nanoassemblies are co-delivered with at least one guide RNA (such as those described in Examples 4, 5, 8, 9, 10, 12, and 13) to the plant cells or callus. In other variations of the procedure, the self-assembled Cas9/ArgNP nanoassemblies are prepared with at least one guide RNA to allow the modified Cas9 to form a ribonucleoprotein (RNP) either prior to or after formation of the nanoassemblies; the self-assembled RNP/ArgNP nanoassemblies are then delivered to the plant cells or callus. Efficiency of editing is assessed by any suitable method such as a heteroduplex cleavage assay or by sequencing, as described elsewhere in this disclosure.

In a non-limiting example, self-assembled Cas9/ArgNP nanoassemblies are prepared as described in Mout et al. (2017) ACS Nano, doi:10.1021/acsnano.6b07600 or alternatively as described in International Patent Application Publication WO2016/123514, by mixing a Cas9 from Streptococcus pyogenes modified at the N-terminus with a polyglutamate peptide that includes at least 15 glutamate residues and modified at the C-terminus with a nuclear localization signal (NLS) with cationic arginine gold nanoparticles (ArgNPs). The Cas9/ArgNP nanoassemblies are delivered by infiltration (e. g., using mild positive pressure or negative pressure) into leaves of Arabidopsis thaliana plants. In one variation of the procedure, the Cas9/ArgNP nanoassemblies are co-delivered with at least one guide RNA (such as those described in Examples 4, 5, 8, 9, 10, 12, and 13) to the Arabidopsis leaves. In other variations of the procedure, the self-assembled Cas9/ArgNP nanoassemblies are prepared with at least one guide RNA to allow the modified Cas9 to form a ribonucleoprotein (RNP) either prior to or after formation of the nanoassemblies; the self-assembled RNP/ArgNP nanoassemblies are then delivered to the Arabidopsis leaves. Efficiency of editing is assessed by any suitable method such as a heteroduplex cleavage assay or by sequencing, as described elsewhere in this disclosure.
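Where editing efficiency is assessed by a heteroduplex cleavage assay, band intensities from a gel are often converted to an estimated indel frequency. The Python sketch below uses a commonly applied approximation, indel % = 100 × (1 − sqrt(1 − f)), where f is the fraction of signal in the cleaved bands; this formula, the function name, and the example intensities are assumptions for illustration and are not taken from this disclosure, which leaves the choice of assay and analysis open.

```python
import math


def indel_percent(uncut, cut1, cut2):
    """Estimate indel frequency (%) from heteroduplex cleavage band intensities.

    uncut: intensity of the uncleaved amplicon band
    cut1, cut2: intensities of the two cleavage product bands
    Uses the approximation 100 * (1 - sqrt(1 - f)), with
    f = (cut1 + cut2) / (uncut + cut1 + cut2).
    """
    if min(uncut, cut1, cut2) < 0:
        raise ValueError("band intensities must be non-negative")
    total = uncut + cut1 + cut2
    if total == 0:
        raise ValueError("at least one band intensity must be positive")
    fraction_cleaved = (cut1 + cut2) / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


# Hypothetical band intensities in arbitrary units.
estimate = indel_percent(uncut=64, cut1=18, cut2=18)
```

The square-root term reflects that cleavable heteroduplexes form only when an edited strand reanneals with a differing strand, so the cleaved fraction understates nothing and overstates nothing only under the random-reannealing assumption; sequencing gives a more direct measurement.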

One of skill in the art would recognize that alternatives to the above compositions and procedures can be used to edit plant cells and intact plants, tissues, seeds, and callus. In embodiments, nanoassemblies are made using other sequence-specific nucleases (e. g., zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TAL-effector nucleases or TALENs), Argonaute proteins, or a meganuclease or engineered meganuclease) which can be similarly charge-modified. In embodiments, nanoassemblies are made using other nanoparticles (e. g., nanoparticles made of materials such as carbon, silicon, silicon carbide, gold, tungsten, polymers, ceramics, iron oxide, or cobalt ferrite) which can be similarly charge-modified in order to form non-covalent complexes with the charge-modified sequence-specific nuclease. Similar nanoassemblies including other polypeptides (e. g., phosphatases, hydrolases, oxidoreductases, transferases, lyases, recombinases, polymerases, ligases, and isomerases) or polynucleotides or a combination of polypeptides and polynucleotides are made using similar charge modification methods to enable non-covalent complexation with charge-modified nanoparticles. For example, similar nanoassemblies are made by complexing charge-modified nanoparticles with one or more polypeptides or ribonucleoproteins including at least one functional domain selected from the group consisting of: transposase domains, integrase domains, recombinase domains, resolvase domains, invertase domains, protease domains, DNA methyltransferase domains, DNA hydroxylmethylase domains, DNA demethylase domains, histone acetylase domains, histone deacetylase domains, nuclease domains, repressor domains, activator domains, nuclear-localization signal domains, transcription-regulatory protein (or transcription complex recruiting) domains, cellular uptake activity associated domains, nucleic acid binding domains, antibody presentation domains, histone modifying enzymes, recruiter of histone modifying enzymes, inhibitor of histone modifying enzymes, histone methyltransferases, histone demethylases, histone kinases, histone phosphatases, histone ribosylases, histone deribosylases, histone ubiquitinases, histone deubiquitinases, histone biotinases, and histone tail proteases.

Example 9

As described herein, microinjection techniques can be used as an alternative to the methods for delivering targeting agents to protoplasts as described, e.g., in certain Examples above. Microinjection is typically used to target specific cells in isolated embryo sacs or the shoot apical meristem. See, e.g., U.S. Pat. No. 6,300,543, incorporated by reference herein. For example, an injector attached to a Narishige manipulator on a dissecting microscope is adequate because the cells to be microinjected are relatively large (e.g., the egg/synergids/zygote and the central cell). For smaller cells, such as those of the embryo, a compound, inverted microscope with an attached Narishige manipulator is used. Injection pipette diameter and bevel are also important. Use a high-quality pipette puller and beveler to prepare needles with adequate strength, flexibility and pore diameter. These will vary depending on the cargo being delivered to cells. The volume of fluid to be microinjected must be exceedingly small and must be carefully controlled. An Eppendorf Transjector yields consistent results (Laurie et al., 1999).

The genetic cargo can be RNA, DNA, protein or a combination thereof. The cargo can be designed to change one aspect of the target genome or many. The concentration of each cargo component will vary depending on the nature of the manipulation. Typical cargo volumes can vary from 2 to 20 nanoliters. After microinjection the embryos are maintained on an appropriate medium alone (e.g., sterile MS medium with 10% sucrose) or supplemented with a feeder culture. Plantlets are transferred to fresh MS medium every two weeks and to larger containers as they grow. Plantlets with a well-developed root system are transferred to soil and maintained in high humidity for 5 days to acclimate. Plants are gradually exposed to the air and cultivated to reproductive maturity.

Microinjection of corn embryos: The cobs and tassels are bagged immediately when they appear, to prevent pollination. To obtain zygote-containing maize embryo sacs, hand pollination of silks is performed when the silks are 6-10 cm long, the pollinated ears are bagged and tassels removed, and then ears are harvested 16 hours later. After removing husks and silks, the cobs are cut transversely into 3 cm segments. The segments are surface sterilized in 70% ethanol and then rinsed in sterile distilled, deionized water. Ovaries are then removed and prepared for sectioning. The initial preparation may include mechanical removal of the ovarian wall, but this may not be required.

Once the ovaries have been removed, they are attached to a Vibratome sectioning block, an instrument designed to produce histological sections without chemical fixation or embedment. The critical attachment step is accomplished using a commercial adhesive such as Loctite cement. Normally 2-3 pairs of ovaries are attached on each sterile sectioning block with the adaxial ovarian surface facing upwards and perpendicular to the longitudinal axis of the rectangular sectioning block (Laurie et al., In Vitro Cell Dev Biol., 35: 320-325, 1999). Ovarian sections (or “nucellar slabs”) are obtained at a thickness of 200 to 400 micrometers. Ideal section thickness is 200 micrometers. The embryo sac will remain viable if it is not cut. The sections are collected with fine forceps and evaluated on a dissecting microscope with basal illumination. Sections with an intact embryo sac are placed on semi-solid Murashige-Skoog (MS) culture medium (Campenot et al., 1992) containing 15% sucrose and 0.1 mg/L benzylaminopurine. Sterile Petri plates containing semi-solid MS medium and nucellar slabs are then placed in an incubator maintained at 26° C. These can be monitored visually by removing plates from the incubator and examining the nucellar slabs with a dissecting microscope in a laminar flow hood.

Microinjection of soybean embryonic axes: Mature soybean seeds are surface sterilized using chlorine gas. The gas is cleared by air flow in a sterile, laminar flow hood. Seeds are wetted with 70% ethanol for 30 seconds and rinsed with sterile distilled, deionized water, then incubated in sterile distilled, deionized water for 30 minutes to 12 hours. The embryonic axes are carefully removed from the cotyledons and placed in MS media with the radicle oriented downwards and the apex exposed to air. The embryonic leaves are carefully removed with fine tweezers to expose the shoot apical meristem.

Microinjection of rice: Rice tissues that are appropriate for genome editing manipulation include embryogenic callus, exposed shoot apical meristems and 1 DAP embryos. There are many approaches to producing embryogenic callus (for example, Tahir 2010 (doi:10.1007/978-1-61737-988-8_21); Ge et al., 2006 (doi:10.1007/s00299-005-0100-7)). Shoot apical meristem explants can be prepared using a variety of methods in the art (see, e.g., Sticklen and Oraby, 2005 (doi:10.1079/IVP2004616); Baskaran and Dasgupta, 2012 (doi:10.1007/s13562-011-0078-x)). This work describes how to prepare and nurture material that is adequate for microinjection.

To prepare 1 DAP embryos for microinjection, Indica or japonica rice are cultivated under ideal conditions in a greenhouse with supplemental lighting with a 13-hour day, day/night temperatures of 30°/20° C., relative humidity between 60-80%, and adequate fertigation using Hoagland's solution or an equivalent. The 1 DAP zygotes are identified and prepped essentially as described in Zhang et al., 1999, Plant Cell Reports (doi:10.1007/s002990050722). The dissected ovaries with exposed zygotes are placed on the appropriate solid support medium and oriented for easy access using a microinjection needle. Injection and subsequent growth is carried out as described above in this Example.

Microinjection of tomato: Tomato tissues that are appropriate for genome editing manipulation include embryogenic callus, exposed shoot apical meristems and 1 DAP embryos. There are many approaches to producing embryogenic callus (for example, Toyoda et al., 1988 (doi:10.1007/BF00269921), Tahir 2010 (doi:10.1007/978-1-61737-988-8_21), Ge et al., 2006 (doi:10.1007/s00299-005-0100-7), Senapati, 2016 (doi:10.9734/ARRB/2016/22300)). Shoot apical meristem explants can be prepared using a variety of methods in the art (Sticklen and Oraby, 2005 (doi:10.1079/IVP2004616), Baskaran and Dasgupta, 2012 (doi:10.1007/s13562-011-0078-x), Senapati, 2016 (doi:10.9734/ARRB/2016/22300)). This work describes how to prepare and nurture material that is adequate for microinjection.

To prepare one day after germination seedlings for microinjection, tomato seeds are germinated under ideal conditions in a growth chamber with supplemental lighting for a 16-hour day, day/night temperatures of 25/20° C., and relative humidity between 60-80%. The one day after germination seedlings are identified and prepped essentially as described in Vinoth et al., 2013 (doi:10.1007/s12010-012-0006-0). Germinated seeds with 2-3 mm meristems are placed on the appropriate solid support medium and oriented for easy access using a microinjection needle. Injection and subsequent growth is carried out as described above in this Example.

Example 10

This example illustrates a method of changing expression of a sequence of interest in a genome, comprising integrating sequence encoded by a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule at the site of at least one double-strand break (DSB) in a genome. More specifically, this non-limiting example illustrates using a ribonucleoprotein (RNP) including a guide RNA (gRNA) and a nuclease to effect a DSB in the genome of a plant, and integration of sequence encoded by a double-stranded DNA (dsDNA) at the site of the DSB, wherein the dsDNA molecule includes a sequence recognizable by a specific binding agent, and wherein contacting the integrated sequence encoded by the dsDNA molecule with the specific binding agent results in a change of expression of a sequence of interest. In this particular example, the sequence recognizable by a specific binding agent includes a recombinase recognition site sequence, the specific binding agent is a site-specific recombinase, and the change of expression is upregulation or downregulation or expression of a transcript having an altered sequence (for example, expression of a transcript that has had a region of DNA excised, inverted, or translocated by the recombinase).

The loxP (“locus of cross-over”) recombinase recognition site and its corresponding recombinase, Cre, were originally identified in the P1 bacteriophage. The wild-type loxP 34 base-pair sequence is

(SEQ ID NO: 7) ATAACTTCGTATAGCATACATTATACGAAGTTAT

and includes two 13 base-pair palindromic sequences flanking an 8 base-pair spacer sequence; the spacer sequence (GCATACAT, positions 14-21) is asymmetric and provides directionality to the loxP site. Other useful loxP variants or recombinase recognition site sequences that function with Cre recombinase are provided in Table 2.

TABLE 2

SEQ ID NO:  Cre recombinase recognition site  Sequence
7           LoxP (wild-type 1)                ATAACTTCGTATAGCATACATTATACGAAGTTAT
8           LoxP (wild-type 2)                ATAACTTCGTATAATGTATGCTATACGAAGTTAT
9           Canonical LoxP                    ATAACTTCGTATANNNTANNNTATACGAAGTTAT
10          Lox 511                           ATAACTTCGTATAATGTATACTATACGAAGTTAT
11          Lox 5171                          ATAACTTCGTATAATGTGTACTATACGAAGTTAT
12          Lox 2272                          ATAACTTCGTATAAAGTATCCTATACGAAGTTAT
13          M2                                ATAACTTCGTATAAGAAACCATATACGAAGTTAT
14          M3                                ATAACTTCGTATATAATACCATATACGAAGTTAT
15          M7                                ATAACTTCGTATAAGATAGAATATACGAAGTTAT
16          M11                               ATAACTTCGTATAAGATAGAATATACGAAGTTAT
17          Lox 71                            TACCGTTCGTATANNNTANNNTATACGAAGTTAT
18          Lox 66                            ATAACTTCGTATANNNTANNNTATACGAACGGTA
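The 13/8/13 base-pair organization of the sites in Table 2 can be checked with a short Python sketch that splits a 34-mer into its left arm, spacer, and right arm, tests whether the arms are inverted repeats of each other, and tests whether the spacer differs from its own reverse complement (which is what gives the site a direction). The function names are hypothetical; the sequences are SEQ ID NO: 7 and SEQ ID NO: 17 from Table 2.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string (N allowed)."""
    return seq.translate(COMPLEMENT)[::-1]


def describe_lox_site(site):
    """Split a 34 bp lox site into 13 bp arm, 8 bp spacer, 13 bp arm."""
    site = site.upper()
    if len(site) != 34:
        raise ValueError("expected a 34 base-pair recognition site")
    left, spacer, right = site[:13], site[13:21], site[21:]
    return {
        "left_arm": left,
        "spacer": spacer,
        "right_arm": right,
        # True when the right arm is the reverse complement of the left arm.
        "arms_are_inverted_repeats": right == reverse_complement(left),
        # True when the spacer reads differently on the two strands.
        "spacer_is_asymmetric": spacer != reverse_complement(spacer),
    }


wild_type = describe_lox_site("ATAACTTCGTATAGCATACATTATACGAAGTTAT")  # SEQ ID NO: 7
lox71 = describe_lox_site("TACCGTTCGTATANNNTANNNTATACGAAGTTAT")      # SEQ ID NO: 17
```

For the wild-type site the arms are exact inverted repeats and the spacer is asymmetric, whereas Lox 71 carries an altered left arm, so its two arms no longer mirror each other.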

Cre recombinase catalyzes the recombination between two compatible (non-heterospecific) loxP sites, which can be located either on the same or on separate DNA molecules. Thus, in embodiments of the invention, polynucleotide (such as double-stranded DNA, single-stranded DNA, single-stranded DNA/RNA hybrid, or double-stranded DNA/RNA hybrid) molecules including compatible recombinase recognition site sequences are integrated at the site of two or more double-strand breaks (DSBs) in a genome, which can be on the same or on separate DNA molecules (such as chromosomes). Depending on the number of recombinase recognition sites, where these are integrated, and in what orientation, various results are achieved, such as expression of a transcript that has had a region of DNA excised, inverted, or translocated by the recombinase. For example, in the case where one pair of loxP sites (or any pair of compatible recombinase recognition sites) is integrated at the site of DSBs in the genome, if the loxP sites are on the same DNA molecule and integrated in the same orientation, the genomic sequence flanked by the loxP sites is excised, resulting in a deletion of that portion of the genome. If the loxP sites are on the same DNA molecule and integrated in opposite orientation, the genomic sequence flanked by the loxP sites is inverted. If the loxP sites are on separate DNA molecules, translocation of genomic sequence adjacent to the loxP site occurs. Examples of heterologous arrangements or integration patterns of recombinase recognition sites and methods for their use, particularly in plant breeding, are disclosed in U.S. Pat. No. 8,816,153 (see, for example, the Figures and working examples), the entire specification of which is incorporated herein by reference.
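The three outcomes described above (excision, inversion, translocation) can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not a description of the recombination mechanism: a DNA molecule is represented as a list of named segments, the markers `lox>` and `lox<` are hypothetical tokens for a site in forward or reverse orientation, and a trailing apostrophe marks an inverted segment.

```python
# Toy model only: segment names and the "lox>"/"lox<" tokens are hypothetical.
LOX_FWD, LOX_REV = "lox>", "lox<"

def flip(segment):
    """Toggle the orientation mark (trailing apostrophe) on a segment name."""
    return segment[:-1] if segment.endswith("'") else segment + "'"

def recombine_same_molecule(segments):
    """Apply one Cre event to a molecule carrying exactly two lox markers."""
    idx = [i for i, s in enumerate(segments) if s in (LOX_FWD, LOX_REV)]
    if len(idx) != 2:
        raise ValueError("expected exactly two lox markers")
    i, j = idx
    if segments[i] == segments[j]:
        # Same orientation: the flanked region is excised, one site remains.
        return segments[: i + 1] + segments[j + 1 :]
    # Opposite orientation: the flanked region is inverted in place.
    inverted = [flip(s) for s in reversed(segments[i + 1 : j])]
    return segments[: i + 1] + inverted + segments[j:]

def recombine_two_molecules(mol_a, mol_b):
    """Exchange the segments downstream of a single lox marker on each molecule."""
    i = next(k for k, s in enumerate(mol_a) if s in (LOX_FWD, LOX_REV))
    j = next(k for k, s in enumerate(mol_b) if s in (LOX_FWD, LOX_REV))
    return mol_a[: i + 1] + mol_b[j + 1 :], mol_b[: j + 1] + mol_a[i + 1 :]

if __name__ == "__main__":
    print(recombine_same_molecule(["up", LOX_FWD, "A", "B", LOX_FWD, "down"]))
    print(recombine_same_molecule(["up", LOX_FWD, "A", "B", LOX_REV, "down"]))
    print(recombine_two_molecules(["a1", LOX_FWD, "a2"], ["b1", LOX_FWD, "b2"]))
```

The model deliberately ignores the excised circular product and any reversibility of the reaction; it is meant only to make the orientation logic concrete.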

One of skill in the art would recognize that the details provided here are applicable to other recombinases and their corresponding recombinase recognition site sequences, such as, but not limited to, FLP recombinase and frt recombinase recognition site sequences, R recombinase and Rs recombinase recognition site sequences, Dre recombinase and rox recombinase recognition site sequences, and Gin recombinase and gix recombinase recognition site sequences.

Example 11

This example illustrates compositions and reaction mixtures useful for delivering at least one effector molecule for inducing a genetic alteration in a plant cell or plant protoplast.

Sequences of plasmids for delivery of Cas9 (Csn1) endonuclease from the Streptococcus pyogenes Type II CRISPR/Cas system and for delivery of a single guide RNA (sgRNA) are provided in Tables 3 and 4. In this non-limiting example, the sgRNA targets the endogenous phytoene desaturase (PDS) in soybean, Glycine max; one of skill would understand that other sgRNA sequences for alternative target genes could be substituted in the plasmid.

TABLE 3 sgRNA vector (SEQ ID NO: 677), 3079 base pairs DNA

Nucleotide position in SEQ ID NO: 677 | Description | Comment
1-3079 | Intact plasmid | SEQ ID NO: 677
379-395 | M13 forward primer for sequencing |
412-717 | Glycine max U6 promoter |
717-736 | Glycine max phytoene desaturase targeting sequence (gRNA) | SEQ ID NO: 678
737-812 | guide RNA scaffold sequence for S. pyogenes CRISPR/Cas9 system | SEQ ID NO: 679
856-874 | M13 reverse primer for sequencing | complement
882-898 | lac repressor encoded by lacI |
906-936 | lac promoter for the E. coli lac operon | complement
951-972 | E. coli catabolite activator protein (CAP) binding site |
1260-1848 | high-copy-number ColE1/pMB1/pBR322/pUC origin of replication (left direction) | complement
2019-2879 | CDS for bla, beta-lactamase, AmpR | complement; ampicillin selection
2880-2984 | bla promoter | complement

The sgRNA vector having the sequence of SEQ ID NO:677 contains nucleotides at positions 717-812 encoding a single guide RNA having the sequence of SEQ ID NO:680 (GAAGCAAGAGACGTTCTAGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC), which includes both a targeting sequence (gRNA) (GAAGCAAGAGACGTTCTAGG, SEQ ID NO:678) and a guide RNA scaffold (GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC, SEQ ID NO:679); transcription of the sgRNA is driven by a Glycine max U6 promoter at nucleotide positions 412-717. The sgRNA vector also includes lac operon and ampicillin resistance sequences for convenient selection of the plasmid in bacterial cultures.
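The relationship between the targeting sequence, the scaffold, and the full sgRNA can be confirmed with a few lines of Python. This is a minimal sketch: the sequences and coordinates are those given for SEQ ID NOs:678-680 and Table 3, while the function names (`span_len`, `build_sgrna`, `to_rna`) are hypothetical helpers introduced only for illustration.

```python
# Sequences as given for SEQ ID NOs: 678-680; helper names are hypothetical.
TARGETING = "GAAGCAAGAGACGTTCTAGG"  # SEQ ID NO: 678
SCAFFOLD = (
    "GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC"
)  # SEQ ID NO: 679
SGRNA = (
    "GAAGCAAGAGACGTTCTAGG"
    "GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC"
)  # SEQ ID NO: 680

def span_len(start, end):
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1

def build_sgrna(targeting, scaffold):
    """An sgRNA-encoding sequence is the targeting sequence followed by the scaffold."""
    return targeting + scaffold

def to_rna(dna):
    """Write the encoded transcript by replacing T with U."""
    return dna.replace("T", "U")

if __name__ == "__main__":
    assert build_sgrna(TARGETING, SCAFFOLD) == SGRNA
    # Feature lengths implied by Table 3 coordinates.
    print(span_len(717, 736), span_len(737, 812), span_len(717, 812))
    print(to_rna(TARGETING))
```

Swapping in a different 20-nucleotide targeting sequence while keeping the scaffold fixed is the substitution contemplated for alternative target genes.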

TABLE 4 endonuclease vector (SEQ ID NO: 681), 8569 base pairs DNA

Nucleotide position in SEQ ID NO: 681 | Description | Comment
1-8569 | Intact plasmid | SEQ ID NO: 681
379-395 | M13 forward primer for sequencing |
419-1908 | Glycine max UbiL promoter |
1917-6020 | Cas9 (Csn1) endonuclease from the Streptococcus pyogenes type II CRISPR/Cas system | SEQ ID NO: 682 (encodes protein with sequence of SEQ ID NO: 683)
6033-6053 | nuclear localization signal of SV40 large T antigen | SEQ ID NO: 684 (encodes peptide with sequence of SEQ ID NO: 685)
6065-6317 | nopaline synthase (NOS) terminator and poly(A) signal |
6348-6364 | M13 reverse primer for sequencing | complement
6372-6388 | lac repressor encoded by lacI |
6396-6426 | lac promoter for the E. coli lac operon | complement
6441-6462 | E. coli catabolite activator protein (CAP) binding site |
6750-7338 | high-copy-number ColE1/pMB1/pBR322/pUC origin of replication (left direction) | complement
7509-8369 | CDS for bla, beta-lactamase, AmpR | complement; ampicillin selection
8370-8474 | bla promoter | complement

The endonuclease vector having the sequence of SEQ ID NO:681 contains nucleotides at positions 1917-6020 having the sequence of SEQ ID NO:682 and encoding the Cas9 nuclease from Streptococcus pyogenes that has the amino acid sequence of SEQ ID NO:683, and nucleotides at positions 6033-6053 having the sequence of CCTAAGAAGAAGAGGAAGGTT (SEQ ID NO:684) and encoding the nuclear localization signal (NLS) of simian virus 40 (SV40) large T antigen that has the amino acid sequence of PKKKRKV (SEQ ID NO:685). Transcription of the Cas9 nuclease and adjacent SV40 nuclear localization signal is driven by a Glycine max UbiL promoter at nucleotide positions 419-1908; the resulting transcript including nucleotides at positions 1917-6053 having the sequence of SEQ ID NO:686 encodes a fusion protein having the sequence of SEQ ID NO:687 wherein the Cas9 nuclease is linked through a 4-residue peptide linker to the SV40 nuclear localization signal. The endonuclease vector also includes lac operon and ampicillin resistance sequences for convenient selection of the plasmid in bacterial cultures.
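The coordinates and the NLS sequence given above are internally consistent, as the following minimal Python sketch shows. It assumes the standard genetic code; the helper names (`translate`, `span_len`) are hypothetical, and the codon counts are simple arithmetic on the stated positions rather than values read from the sequence listing.

```python
# Standard genetic code, built in TCAG order; helper names are hypothetical.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate a DNA string in frame 0, ignoring any trailing partial codon."""
    return "".join(CODON_TABLE[dna[i : i + 3]] for i in range(0, len(dna) - 2, 3))

def span_len(start, end):
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1

NLS_DNA = "CCTAAGAAGAAGAGGAAGGTT"  # SEQ ID NO: 684

if __name__ == "__main__":
    print(translate(NLS_DNA))                # PKKKRKV (SEQ ID NO: 685)
    cas9_codons = span_len(1917, 6020) // 3  # Cas9 coding region
    nls_codons = span_len(6033, 6053) // 3   # SV40 NLS
    linker_codons = (6033 - 6020 - 1) // 3   # gap between the two features
    fusion_codons = span_len(1917, 6053) // 3
    print(cas9_codons, linker_codons, nls_codons, fusion_codons)
```

The 12-nucleotide gap between positions 6020 and 6033 corresponds to the 4-residue peptide linker, and the three parts sum to the length of the region at positions 1917-6053.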

Similar vectors for expression of nucleases and sgRNAs are also described, e. g., in Fauser et al. (2014) Plant J., 79:348-359; and described at www[dot]addgene[dot]org/crispr. It will be apparent to one skilled in the art that analogous plasmids are easily designed to encode other guide polynucleotide or nuclease sequences, optionally including different elements (e. g., different promoters, terminators, selectable or detectable markers, a cell-penetrating peptide, a nuclear localization signal, a chloroplast transit peptide, or a mitochondrial targeting peptide, etc.), and used in a similar manner. Embodiments of nuclease fusion proteins include fusions (with or without an optional peptide linking sequence) between the Cas9 nuclease from Streptococcus pyogenes that has the amino acid sequence of SEQ ID NO:683 and at least one of the following peptide sequences: (a) GRKKRRQRRRPPQ (“HIV-1 Tat (48-60)”, SEQ ID NO:688), (b) GRKKRRQRRRPQ (“TAT”, SEQ ID NO:689), (c) YGRKKRRQRRR (“TAT (47-57)”, SEQ ID NO:690), (d) KLALKLALKALKAALKLA (“MAP (KLAL)”, SEQ ID NO:691), (e) RQIRIWFQNRRMRWRR (“Penetratin-Arg”, SEQ ID NO:692), (f) CSIPPEVKFNKPFVYLI (“antitrypsin (358-374)”, SEQ ID NO:693), (g) RRRQRRKKRGGDIMGEWGNEIFGAIAGFLG (“TAT-HA2 Fusion Peptide”, SEQ ID NO:694), (h) FVQWFSKFLGRIL-NH2 (“Temporin L, amide”, SEQ ID NO:695), (i) LLIILRRRIRKQAHAHSK (“pVEC (Cadherin-5)”, SEQ ID NO:696), (j) LGTYTQDFNKFHTFPQTAIGVGAP (“Calcitonin”, SEQ ID NO:697), (k) GAAEAAARVYDLGLRRLRQRRRLRRERVRA (“Neurturin”, SEQ ID NO:698), (l) MGLGLHLLVLAAALQGAWSQPKKKRKV (“Human P1”, SEQ ID NO:699), (m) RQIKIWFQNRRMKWKKGG (“Penetratin”, SEQ ID NO:700), poly-arginine peptides including (n) RRRRRRRR (“octo-arginine”, SEQ ID NO:701) and (o) RRRRRRRRR (“nono-arginine”, SEQ ID NO:702), and (p) KKLFKKILKYLKKLFKKILKYLKKKKKKKK (“(BP100x2)-K8”, SEQ ID NO:703); these nuclease fusion proteins are specifically claimed herein, as are analogous fusion proteins including a nuclease selected from Cpf1, CasY, CasX, C2c1, or C2c3 and at least one of the peptides having a sequence selected from SEQ ID NOs:688-703. In other embodiments, such vectors are used to produce a guide RNA (such as one or more crRNAs or sgRNAs) or the nuclease protein; guide RNAs and nucleases can be combined to produce a specific ribonucleoprotein complex for delivery to the plant cell; in an example, a ribonucleoprotein including the sgRNA having the sequence of SEQ ID NO:680 and the Cas9-NLS fusion protein having the sequence of SEQ ID NO:687 is produced for delivery to the plant cell. Related aspects of the invention thus encompass ribonucleoprotein compositions containing the ribonucleoprotein including the sgRNA having the sequence of SEQ ID NO:680 and a Cas9 fusion protein such as the Cas9-NLS fusion protein having the sequence of SEQ ID NO:687, and polynucleotide compositions containing one or more polynucleotides including the sequences of SEQ ID NOs:680 or 686. The above sgRNA and nuclease vectors are delivered to plant cells or plant protoplasts using compositions and methods described in the specification.

A plasmid (“pCas9TPC-GmPDS”) having the nucleotide sequence of SEQ ID NO:704 was designed for simultaneous delivery of Cas9 (Csn1) endonuclease from the Streptococcus pyogenes Type II CRISPR/Cas system and a single guide RNA (sgRNA). In this non-limiting example, the sgRNA targets the endogenous phytoene desaturase (PDS) in soybean, Glycine max; one of skill would understand that other sgRNA sequences for alternative target genes could be substituted in the plasmid. The sequences of this plasmid and specific elements contained therein are described in Table 5 below.

TABLE 5 pCas9TPC-GmPDS vector (SEQ ID NO: 704), 14548 base pairs DNA

Nucleotide position in SEQ ID NO: 704 | Description | Comment
1-14548 | Intact plasmid | SEQ ID NO: 704
1187-1816 | pVS1 StaA stability protein from the Pseudomonas plasmid pVS1 |
2250-3317 | pVS1 RepA replication protein from the Pseudomonas plasmid pVS1 |
3383-3577 | pVS1 oriV origin of replication for the Pseudomonas plasmid pVS1 |
3921-4061 | basis of mobility region from pBR322 |
4247-4835 | high-copy-number ColE1/pMB1/pBR322/pUC origin of replication (left direction) | complement
5079-5870 | aminoglycoside adenylyltransferase (aadA), confers resistance to spectinomycin and streptomycin | complement
6398-6422 | left border repeat from nopaline C58 T-DNA |
6599-6620 | E. coli catabolite activator protein (CAP) binding site |
6635-6665 | lac promoter for the E. coli lac operon |
6673-6689 | lac repressor encoded by lacI |
6697-6713 | M13 reverse primer for sequencing |
6728-7699 | PcUbi4-2 promoter |
7714-11817 | Cas9 (Csn1) endonuclease from the Streptococcus pyogenes type II CRISPR/Cas system | SEQ ID NO: 682 (encodes protein with sequence of SEQ ID NO: 683)
11830-11850 | nuclear localization signal of SV40 large T antigen | SEQ ID NO: 684 (encodes peptide with sequence of SEQ ID NO: 685)
11868-12336 | Pea3A terminator |
12349-12736 | AtU6-26 promoter |
12737-12756 | Glycine max phytoene desaturase targeting sequence (gRNA) | SEQ ID NO: 678
12757-12832 | guide RNA scaffold sequence for S. pyogenes CRISPR/Cas9 system | SEQ ID NO: 679
12844-12868 | attB2; recombination site for Gateway® BP reaction | complement
13549-14100 | Streptomyces hygroscopicus bar or pat, encodes phosphinothricin acetyltransferase, confers resistance to bialaphos or phosphinothricin |
14199-14215 | M13 forward primer, for sequencing | complement
14411-14435 | right border repeat from nopaline C58 T-DNA |

The pCas9TPC-GmPDS vector having the sequence of SEQ ID NO:704 contains nucleotides at positions 12737-12832 encoding a single guide RNA having the sequence of SEQ ID NO:680, which includes both a targeting sequence (gRNA) (SEQ ID NO:678) and a guide RNA scaffold (SEQ ID NO:679); transcription of the single guide RNA is driven by an AtU6-26 promoter at nucleotide positions 12349-12736. This vector further contains nucleotides at positions 7714-11817 having the sequence of SEQ ID NO:682 and encoding the Cas9 nuclease from Streptococcus pyogenes that has the amino acid sequence of SEQ ID NO:683, and nucleotides at positions 11830-11850 having the sequence of SEQ ID NO:684 and encoding the nuclear localization signal (NLS) of simian virus 40 (SV40) large T antigen that has the amino acid sequence of SEQ ID NO:685. Transcription of the Cas9 nuclease and adjacent SV40 nuclear localization signal is driven by a PcUbi4-2 promoter at nucleotide positions 6728-7699; the resulting transcript including nucleotides at positions 7714-11850 having the sequence of SEQ ID NO:686 encodes a fusion protein having the sequence of SEQ ID NO:687 wherein the Cas9 nuclease is linked through a 4-residue peptide linker to the SV40 nuclear localization signal. The pCas9TPC-GmPDS vector also includes lac operon, aminoglycoside adenylyltransferase, and phosphinothricin acetyltransferase sequences for convenient selection of the plasmid in bacterial or plant cultures.

A plasmid (“pCas9TPC-NbPDS”) having the nucleotide sequence of SEQ ID NO:705 was designed for simultaneous delivery of Cas9 (Csn1) endonuclease from the Streptococcus pyogenes Type II CRISPR/Cas system and a single guide RNA (sgRNA); see Nekrasov et al. (2013) Nature Biotechnol., 31:691-693. In this non-limiting example, the sgRNA targets the endogenous phytoene desaturase (PDS) in Nicotiana benthamiana; one of skill would understand that other sgRNA sequences for alternative target genes could be substituted in the plasmid. The sequences of this plasmid and specific elements contained therein are described in Table 6 below.

TABLE 6 pCas9TPC-NbPDS vector (SEQ ID NO: 705), 14548 base pairs DNA

Nucleotide position in SEQ ID NO: 705 | Description | Comment
1-14548 | Intact plasmid | SEQ ID NO: 705
1187-1816 | pVS1 StaA stability protein from the Pseudomonas plasmid pVS1 |
2250-3317 | pVS1 RepA replication protein from the Pseudomonas plasmid pVS1 |
3383-3577 | pVS1 oriV origin of replication for the Pseudomonas plasmid pVS1 |
3921-4061 | basis of mobility region from pBR322 |
4247-4835 | high-copy-number ColE1/pMB1/pBR322/pUC origin of replication (left direction) | complement
5079-5870 | aminoglycoside adenylyltransferase (aadA), confers resistance to spectinomycin and streptomycin | complement
6398-6422 | left border repeat from nopaline C58 T-DNA |
6599-6620 | E. coli catabolite activator protein (CAP) binding site |
6635-6665 | lac promoter for the E. coli lac operon |
6673-6689 | lac repressor encoded by lacI |
6697-6713 | M13 reverse primer for sequencing |
6728-7699 | PcUbi4-2 promoter |
7714-11817 | Cas9 (Csn1) endonuclease from the Streptococcus pyogenes type II CRISPR/Cas system | SEQ ID NO: 682 (encodes protein with sequence of SEQ ID NO: 683)
11830-11850 | nuclear localization signal of SV40 large T antigen | SEQ ID NO: 684 (encodes peptide with sequence of SEQ ID NO: 685)
11868-12336 | Pea3A terminator |
12349-12736 | AtU6-26 promoter |
12737-12756 | Nicotiana benthamiana phytoene desaturase targeting sequence | SEQ ID NO: 706
12757-12832 | guide RNA scaffold sequence for S. pyogenes CRISPR/Cas9 system | SEQ ID NO: 679
12844-12868 | attB2; recombination site for Gateway® BP reaction | complement
13549-14100 | Streptomyces hygroscopicus bar or pat, encodes phosphinothricin acetyltransferase, confers resistance to bialaphos or phosphinothricin |
14199-14215 | M13 forward primer, for sequencing | complement
14411-14435 | right border repeat from nopaline C58 T-DNA |

The pCas9TPC-NbPDS vector having the sequence of SEQ ID NO:705 contains nucleotides at positions 12737-12832 encoding a single guide RNA having the sequence of SEQ ID NO:707 (GCCGTTAATTTGAGAGTCCAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC), which includes both a targeting sequence (gRNA) (GCCGTTAATTTGAGAGTCCA, SEQ ID NO:706) and a guide RNA scaffold (SEQ ID NO:679); transcription of the single guide RNA is driven by an AtU6-26 promoter at nucleotide positions 12349-12736. This vector further contains nucleotides at positions 7714-11817 having the sequence of SEQ ID NO:682 and encoding the Cas9 nuclease from Streptococcus pyogenes that has the amino acid sequence of SEQ ID NO:683, and nucleotides at positions 11830-11850 having the sequence of SEQ ID NO:684 and encoding the nuclear localization signal (NLS) of simian virus 40 (SV40) large T antigen that has the amino acid sequence of SEQ ID NO:685. Transcription of the Cas9 nuclease and adjacent SV40 nuclear localization signal is driven by a PcUbi4-2 promoter at nucleotide positions 6728-7699; the resulting transcript including nucleotides at positions 7714-11850 having the sequence of SEQ ID NO:686 encodes a fusion protein having the sequence of SEQ ID NO:687 wherein the Cas9 nuclease is linked through a 4-residue peptide linker to the SV40 nuclear localization signal. The pCas9TPC-NbPDS vector also includes lac operon, aminoglycoside adenylyltransferase, and phosphinothricin acetyltransferase sequences for convenient selection of the plasmid in bacterial or plant cultures.

Example 12

This example describes the preparation of reagents to create novel diversity in a region of the genome where low recombination frequency has prevented plant breeders from being able to select for novel alleles.

Soybean protoplasts are prepared as described in Examples 1-5. Preparation of reagents is completed essentially as described in Examples 6-9.

The gene selected is SHAT1-5 (see www.uniprot.org/uniprot/W8E7P1), a major domestication gene in soybean responsible for the reduced pod shattering that is required for harvestability (doi:10.1038/ncomms4352). The selective sweep and apparent low rate of recombination at this locus have resulted in no detectable genetic diversity across a 116 kb region of Glycine max chromosome 16 including 5 genes. As such, breeders have not been able to select different alleles of SHAT1-5 or diverse alleles for the surrounding 5 genes. A partial genomic sequence of SHAT1-5 is provided as

CACGTGGCCCCACACACATTTTTTTTCCCTCAACAGTTAAACTCTCTTCCTCCATCTTTCTTGGTAGGTGGCACTTCTCGGAGCATAGTAAAACTAACCCCATCATTTTCATTATATTATAAACCTATATATATACCCAATTGGTTATTGGTGTCTGGTGTCCCTTCAACCTTTAAAACAAACAAATCCattttctttttcttttttttttcattttattttttccattattttatCAACACAATTAATTCCACTGTCCCACAGCACATATATATAGTCTCGCTTTACATACTCATTCCATGGCCAGTACACACACCATCAATTCCTATCCTCTTCCTTGTAGTGTACCCATTTTGAATGTGTtctctctctctctctctttTTAGGTCCCTGGTGAATATCTAGAACCACTCTCT (SEQ ID NO:708).

A SHAT1-repressor nucleotide sequence encoded by a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule having the sequence ATTAAAAAAATAAATAAGATATTATTAAAAAAATAAATAAGATATTATTAAAAAAATAAATAAGATATTATTAAAAAAATAAATAAGATATT (SEQ ID NO:709) is designed for insertion at a double-strand break effected between nucleotides at positions 103/104, 274/275, or 359/360 of SEQ ID NO:708 (insertion is in between the underlined nucleotides) in order to reduce the expression of the SHAT1-5 gene. The nucleotides targeted by each of the three different SHAT1-5 crRNAs are shown in bold italic in SEQ ID NO:708; the crRNA sequences are provided as AAAUGAAAAAGAAAAAUGUGGUUUUAGAGCUAUGCU (SEQ ID NO:710), AAAGGACCAAAGGAUACACAGUUUUAGAGCUAUGCU (SEQ ID NO:711), and AAGAAAGAUAUAAUGAGGUGGUUUUAGAGCUAUGCU (SEQ ID NO:712). All crRNA and tracrRNA are purchased from Integrated DNA Technologies, Coralville, Iowa. Ribonucleoprotein (RNP) complexes are prepared using gRNA (crRNA:tracrRNA) and Cas9 nuclease (Aldevron, Fargo, N. Dak.).

Example 13

The determinate habit of soybean (Glycine max) is controlled by a recessive allele at the determinate stem (Dt1, GeneID: 100776154) locus. Dt1 encodes the GmTFL1b protein, an ortholog of Arabidopsis TERMINAL FLOWER 1 (TFL1). The TFL1 gene in Arabidopsis maintains the indeterminate growth of the shoot apical meristem (SAM) by inhibiting the expression of the floral meristem identity genes LFY and AP1. Down-regulation of Dt1 suppressed indeterminate growth at shoot apical meristems in indeterminate plants (doi:10.1104/pp.109.150607). Knocking out Dt1 can convert indeterminate soybean varieties to determinate soybeans.

Dt1 gene exon 1 to exon 2 (473 bp, 83-555 bp downstream of TSS, exons are in uppercase) is:

(SEQ ID NO: 713) ATGGCAAGAATGCCTTTAGAGCCTCTAATAGTGGGGAGAGTCATAGGAGAAGTTCTTGATTCTTTTACCACAAGCACAAAAATGATTGTGAGTTACAACAAGAATCAAGTCTACAATGGCCATGAACTCTTCCCTTCCACTGTCAACACCAAGCCCAAGGTTGAGATTGAGGGTGGTGATATGAGGTCCTTCTTTACACTGgtactatatatatatcttttatctcctctcttattccttttctctttaaagaacaaagttttttgggaaaaaaaagagaggaaaaacttgagctgttgccagtgtgtatctttgttgtttcctgtaattcagctcacacagtaagtctcttttggagttttctttaccaatactgaatcattaagctaatgtctcgcttttttggtgcagATCATGACTGACCCTGATGTTCCTGGCCCTAGTGACCCTTATCTGAGAGAGCACTTGCACTG.

Two RNPs with cutting sites at 225 bp and 511 bp downstream of the TSS are delivered to cells together to knock out the Dt1 gene. The crRNA sequences are

(SEQ ID NO: 714) CUUGGGCUUGGUGUUGACAGGUUUUAGAGCUAUGCU and
(SEQ ID NO: 715) CACUAGGGCCAGGAACAUCAGUUUUAGAGCUAUGCU.
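How a crRNA maps onto its genomic target can be illustrated with a short search over both strands. The sketch below is a hedged illustration, not the design procedure used in this example: it assumes that each crRNA consists of a 20-nucleotide spacer followed by the 16-nucleotide repeat GUUUUAGAGCUAUGCU (the common 3′ end of the crRNAs listed in these examples) and that Cas9 from S. pyogenes requires an NGG PAM immediately 3′ of the protospacer. The genomic string is a 44-nucleotide excerpt of SEQ ID NO:713, and the function names are hypothetical.

```python
# Illustrative sketch; function names and the excerpt boundaries are assumptions.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
CRRNA_REPEAT = "GUUUUAGAGCUAUGCU"  # shared 3' end of the crRNAs in these examples

def revcomp(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq))

def spacer_dna(crrna, repeat=CRRNA_REPEAT):
    """Strip the 3' repeat from a crRNA and return the spacer as DNA."""
    if not crrna.endswith(repeat):
        raise ValueError("crRNA does not end with the expected repeat")
    return crrna[: -len(repeat)].replace("U", "T")

def find_target(genomic, spacer):
    """Return (strand, start, PAM) for the first protospacer followed by NGG, else None.

    'start' is a 0-based offset on the strand that was searched.
    """
    g = genomic.upper()
    for strand, seq in (("+", g), ("-", revcomp(g))):
        pos = seq.find(spacer)
        while pos != -1:
            pam = seq[pos + len(spacer) : pos + len(spacer) + 3]
            if len(pam) == 3 and pam[1:] == "GG":
                return strand, pos, pam
            pos = seq.find(spacer, pos + 1)
    return None

DT1_EXCERPT = "CTTCCCTTCCACTGTCAACACCAAGCCCAAGGTTGAGATTGAGG"  # part of SEQ ID NO: 713
CRRNA_714 = "CUUGGGCUUGGUGUUGACAGGUUUUAGAGCUAUGCU"            # SEQ ID NO: 714

if __name__ == "__main__":
    spacer = spacer_dna(CRRNA_714)
    print(spacer, find_target(DT1_EXCERPT, spacer))
```

For this excerpt the spacer matches the strand complementary to the one shown in SEQ ID NO:713 and is followed by a TGG PAM.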

As an exemplary assay for readout, QPCR can be used to check the transcript level of Dt1, and CRISPR amplicon sequencing can be used to confirm the loss-of-function editing. As a phenotypic readout, edited plants can be scored for the determinate phenotype.

Example 14

Soybean is a facultative short-day plant. Rich genetic variability in photoperiod responses enables the crop to adapt to a wide range of latitudes. Ten major genes have been identified so far to control time to flowering and maturity: E1, E2, E3, E4, E5, E6, E7, E8, E9 and J. E9 was identified through the molecular dissection of a QTL for early flowering introduced from a wild soybean accession. E9 encodes FT2a (GeneID: 100814951), an ortholog of Arabidopsis FLOWERING LOCUS T. Its recessive allele, which has a lower transcript expression level, delays flowering (doi:10.1186/s12870-016-0704-9). Overexpression of FT2a resulted in early flowering and increased pods per node (US20160304891 A1). Insertion of an E2F binding site in the FT2a promoter could increase the expression of FT2a in the meristem.

FT2a promoter (500 bp upstream of TSS) is:

(SEQ ID NO: 716) AATTAATTGACAAAAAATGGTTTCTGTTTCATATAGAAACTATGTTTTTGTTGTGTAGTCATACATTACGGAATCTAGTTTCCATTAAATAAGTAACGTGAAAAAAAATAAAAGGTGAAATATATATTGTTGGAAAAGAAGCTATGAGGTGCAAGAACCGATCACATGGAGAAGGCAATGAAAGACAAGGAGGAGCAATGGAAGAGAGAAAATGAGAAGATGGAAGGGATGTGAAAATGTTTGAAAAAAACGAGGTGATCAGTTTTAAAATACGAATTTAGTATTTTCTTTTTAAGAAAATTCTTTCGGAAAGTCGTGTTTTAAAACATGACTTTTATTTATTTGAAGTCGTGTTCTAAAACATGACTTTATTTCATATCCTTTAATATTTTATATCCTTAATATTTTTAAAATTTATCCATTTGTAATATTTTTTAAAAATTGACCCATATATGTAAAATACCCGTCAAGATCTCTTTATTATTTTGAA AGCGAAAGCA.

The E2F binding site (4X) (5′/Phos/C*C*CGCCAAACCCGCCAAACCCGCCAAACCCGCCA*A*A-3′, SEQ ID NO:213) is inserted at a Cas9/guide RNA cutting site 357 bp, 251 bp, or 31 bp upstream of the TSS. The crRNA sequences are UUGUUGGAAAAGAAGCUAUGGUUUUAGAGCUAUGCU (SEQ ID NO:717), GAAAAUGUUUGAAAAAAACGGUUUUAGAGCUAUGCU (SEQ ID NO:718), and AAAUAAUAAAGAGAUCUUGAGUUUUAGAGCUAUGCU (SEQ ID NO:719).
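The donor oligonucleotide notation used above can be unpacked programmatically. The following minimal sketch assumes the common vendor convention in which `/Phos/` denotes a 5′ phosphate and `*` denotes a phosphorothioate linkage; that reading, the function names, and the 9-nucleotide repeat unit CCCGCCAAA (inferred here from the sequence and the "4X" label) are assumptions, not statements of this specification.

```python
import re

# Assumed notation: "/Phos/" = 5' phosphate, "*" = phosphorothioate linkage.
def parse_modified_oligo(spec):
    """Return (bases, has_5_phosphate, phosphorothioate_count) for an oligo string."""
    has_phosphate = "/Phos/" in spec
    body = spec.replace("/Phos/", "")
    n_phosphorothioate = body.count("*")
    bases = re.sub(r"[^ACGT]", "", body.upper())  # drop marks, digits, separators
    return bases, has_phosphate, n_phosphorothioate

def repeat_count(bases, unit):
    """Number of tandem copies of 'unit' that make up 'bases', or None if not tandem."""
    n, remainder = divmod(len(bases), len(unit))
    return n if remainder == 0 and bases == unit * n else None

# E2F donor as written for SEQ ID NO: 213, without the 5'/3' end labels.
E2F_DONOR = "/Phos/C*C*CGCCAAACCCGCCAAACCCGCCAAACCCGCCA*A*A"

if __name__ == "__main__":
    bases, has_phosphate, n_ps = parse_modified_oligo(E2F_DONOR)
    print(bases, len(bases), has_phosphate, n_ps)
    print(repeat_count(bases, "CCCGCCAAA"))
```

Under these assumptions the donor is a 36-nucleotide sequence made of four tandem copies of the inferred unit, with two modified linkages at each end.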

As an exemplary assay for readout, QPCR can be used to check the transcript level of FT2a when transiently expressing Arabidopsis E2F-a (doi:10.1074/jbc.M205125200) compared to the control (non-edited). Flowering and architecture phenotypes are compared to the wild-type (non-edited).

Example 15

Stay-green refers to the heritable delayed foliar senescence character in model and crop plant species. In functional stay-greens, the transition from the carbon capture period to the nitrogen mobilization (senescence) phase of canopy development is delayed. Quantitative trait loci studies show that functional stay-green is a valuable trait for improving crop stress tolerance (doi:10.1093/jxb/eru037). Virus-induced gene silencing of GmNAC81 (GeneID: 732555) delayed leaf senescence and was associated with reductions in chlorophyll loss and lipid peroxidation. Insertion of a silencer to downregulate the expression of GmNAC81 could lead to a functional stay-green phenotype (doi:10.1093/pcp/pcw059).

NAC81 promoter (500 bp upstream of TSS) is:

(SEQ ID NO: 720) CGTACATTTTTTTTCTTACATGCTGAAATGGAAGAAATTAAAGAACATAAAGTATAAAGTATGGTCATGAAAATTGTAAGAGGAATTTCCGGGTAAAAGTCCAAAACCGAAAGAAAAAATAGGTCAAGCCATAACATGTATATGCTGTCCCACCACAGTTTAGTCTCTGCAGACTTTTTGTCAAGCCGCGTGGGTCCCACTTCTGGCGGACCCACCACACTAATGTCGTAATAATGTGGAGAGTCGCAATTACAAGTCCATTTTCTTTCAAGATTTCTTAGAGACTTTTGTGCCCCCTAGGCCTCCACGACCAAGTCATAACCCAAACTCAAATATTTAATAAAAATAAATCCATCAATTAGCATAGTTATGAAACCAACCATTCCTTATAAATACCCTCACAACACATATTCATTTTATATCAACTAACTTTGTGCTTCCTCCGCAGAAAAATAAAAAAAGAGTAGCTAGCACTAGCTAGCTAAACACA GTACGAGTAG.

The Silencer element (5′-/Phos/G*A*ATA TAT ATA TAT*T*C-3′, SEQ ID NO:215) is inserted at a Cas9/guide RNA cutting site 342 bp, 201 bp, or 98 bp upstream of the TSS to decrease the expression of NAC81. The crRNA sequences are

(SEQ ID NO: 721) AGUCUGCAGAGACUAAACUGGUUUUAGAGCUAUGCU,
(SEQ ID NO: 722) ACUUGGUCGUGGAGGCCUAGGUUUUAGAGCUAUGCU, and
(SEQ ID NO: 723) UAAAAUGAAUAUGUGUUGUGGUUUUAGAGCUAUGCU.

As an exemplary assay for readout, QPCR can be used to check the transcript level of NAC81.

Example 16

MicroRNAs (miRNAs) regulate gene expression by mediating gene silencing at transcriptional and post-transcriptional levels in higher plants. U.S. Pat. No. 9,040,774 B2 provides methods for manipulating expression of a miRNA-regulated target gene by interfering with the binding of the miRNA to its target gene. The miR172 family targets mRNAs encoding APETALA2-like transcription factors. Use of a decoy or miRNA cleavage blocker of miR172 showed improved yield in multiple crops including soybean (see Table 3 in U.S. Pat. No. 9,040,774 B2). Mutation of the miRNA targeting site in the target genes listed in Table 7 can lead to improved yield.

TABLE 7

miRNA | Target Gene | Target Gene Annotation
gma-MIR172b-5p | Glyma01g39520 | transcription factor activity
gma-MIR172b-5p | Glyma03g33470 | transcription factor activity
gma-MIR172b-5p | Glyma05g09400 | protein kinase C activation
gma-MIR172b-5p | Glyma11g05720 | transcription factor activity
gma-MIR172b-5p | Glyma11g10790 | RNA-binding protein
gma-MIR172b-5p | Glyma14g01950 | A2L zinc ribbon domain
gma-MIR172b-5p | Glyma19g36200 | transcription factor activity
gma-MIR172c | Glyma01g39520 | AP2 domain-containing transcription factor
gma-MIR172c | Glyma03g33470 | AP2 domain-containing transcription factor
gma-MIR172c | Glyma05g18170 | AP2 domain-containing transcription factor
gma-MIR172c | Glyma05g31790 | GTPase Rab2, small G protein superfamily
gma-MIR172c | Glyma08g15040 | GTPase Rab2, small G protein superfamily
gma-MIR172c | Glyma11g05720 | AP2 domain-containing transcription factor
gma-MIR172c | Glyma11g15650 | AP2 domain-containing transcription factor
gma-MIR172c | Glyma12g07800 | AP2 domain-containing transcription factor
gma-MIR172c | Glyma13g40470 | AP2 domain-containing transcription factor
gma-MIR172c | Glyma15g04930 | AP2 domain-containing transcription factor
gma-MIR172c | Glyma17g18640 | AP2 domain-containing transcription factor
gma-MIR172c | Glyma19g36200 | AP2 domain-containing transcription factor
gma-MIR172g | Glyma06g15630 | ubiquitin-protein ligase activity
gma-MIR172g | Glyma10g27970 | ATP binding cassette protein
gma-MIR172h-3p | Glyma01g39520 | AP2 domain-containing transcription factor
gma-MIR172h-3p | Glyma03g33470 | AP2 domain-containing transcription factor
gma-MIR172h-3p | Glyma11g05720 | AP2 domain-containing transcription factor
gma-MIR172h-3p | Glyma11g15650 | AP2 domain-containing transcription factor
gma-MIR172h-3p | Glyma12g07800 | AP2 domain-containing transcription factor
gma-MIR172h-3p | Glyma13g40470 | AP2 domain-containing transcription factor (GeneID: 100777102)
gma-MIR172h-3p | Glyma15g04930 | AP2 domain-containing transcription factor
gma-MIR172h-3p | Glyma19g36200 | AP2 domain-containing transcription factor
gma-MIR172h-5p | Glyma06g13450 | putative ATP-dependent Clp-type protease
gma-MIR172h-5p | Glyma10g08730 | nitrate, formate, iron dehydrogenase
gma-MIR172h-5p | Glyma10g30570 | targeting protein for Xklp2
gma-MIR172h-5p | Glyma11g05580 | GTP-binding ADP-ribosylation factor
gma-MIR172h-5p | Glyma11g06830 | ubiquitin-protein ligase
gma-MIR172h-5p | Glyma15g12600 | protease inhibitor

RAP2-7-LIKE Exon 10 (miR172 targeting site is underlined) is:

(SEQ ID NO: 724) GAAAGAGCAGAGAGAATGGGCACAGATCCTTCAAAAGGAGTCCCAAACCCCAACTGGGCGTGGCAAACACATGGCCAGGTTACTGACACCCCAGTACCACCGTTCTCTACTGCAGCATCATCAGGATTCTCAATTTCAGCCACTTTTCCATCAACTGCCATCTTTCCAACAAAATCCATCAACTCAGTTCCCCATAGCCTCTGTTTCACTTCACCCAGCACACCAGGTAGCAACGCACCTCAATTCTATAACTATTACGAGGTGAAGTCCCCGCAGCCACCGTCCTAG.

The Cas9/guide RNA has a cutting site 3968 bp downstream of the TSS. The crRNA sequence is GAUGCUGCAGUAGAGAACGGGUUUUAGAGCUAUGCU (SEQ ID NO:725).

The donor template, containing silent mutations in the miR172 targeting site with flanking regions of 100 bp (underlined), is:

(SEQ ID NO: 726) GAGAATGGGCACAGATCCTTCAAAAGGAGTCCCAAACCCCAACTGGGCGTGGCAAACACATGGCCAGGTTACTGACACCCCAGTACCACCGTTCTCTACTGCTGCCTCTTCCGGTTTTTCCATTTCAGCCACTTTTCCATCAACTGCCATCTTTCCAACAAAATCCATCAACTCAGTTCCCCATAGCCTCTGTTTCACTTCACCCAGCACACCAGGTAGCA.
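Whether the donor changes are silent can be checked by translating the wild-type and donor sequences in each reading frame and looking for a frame in which the nucleotides differ but the encoded peptide does not. The sketch below does this for matching 36-nucleotide excerpts of SEQ ID NO:724 and SEQ ID NO:726; it assumes the standard genetic code, the function names are hypothetical, and the reading frame is inferred by the check itself rather than taken from an annotation.

```python
# Standard genetic code in TCAG order; function names are hypothetical.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna, frame=0):
    """Translate from offset 'frame', ignoring any trailing partial codon."""
    return "".join(CODON_TABLE[dna[i : i + 3]] for i in range(frame, len(dna) - 2, 3))

def mismatches(ref, alt):
    """0-based positions at which two equal-length sequences differ."""
    if len(ref) != len(alt):
        raise ValueError("sequences must have equal length")
    return [i for i, (r, a) in enumerate(zip(ref, alt)) if r != a]

def synonymous_frames(ref, alt):
    """Frames (0, 1, 2) in which ref and alt encode the same peptide."""
    return [f for f in range(3) if translate(ref, f) == translate(alt, f)]

WT_EXCERPT = "TCTACTGCAGCATCATCAGGATTCTCAATTTCAGCC"     # from SEQ ID NO: 724
DONOR_EXCERPT = "TCTACTGCTGCCTCTTCCGGTTTTTCCATTTCAGCC"  # from SEQ ID NO: 726

if __name__ == "__main__":
    print(mismatches(WT_EXCERPT, DONOR_EXCERPT))
    print(synonymous_frames(WT_EXCERPT, DONOR_EXCERPT))
    print(translate(WT_EXCERPT), translate(DONOR_EXCERPT))
```

In this excerpt the donor differs from the wild type at seven positions, each the third base of a codon in the one frame where both sequences encode the same peptide, which is the pattern expected for a recoded miRNA target site.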

As an exemplary assay for readout, QPCR can be used to check the transcript level of RAP2-7-like.

Example 17

Photoperiod responsiveness is a key factor in latitudinal adaptation of soybean. Introduction of the long juvenile trait extends the vegetative phase and improves yield under short-day conditions, thereby enabling expansion of cultivation in tropical regions. J (GeneID: 100793561), the major classical locus conferring the trait, is the ortholog of Arabidopsis thaliana EARLY FLOWERING 3 (ELF3). The J protein downregulates E1 transcription, relieving repression of two important FLOWERING LOCUS T (FT) genes and promoting flowering under short days (doi:10.1038/ng.3819). Reduction of the J transcript level can release E1 from repression; E1 then represses FT2a and FT5a, improving adaptation of varieties from temperate regions to the tropics and enhancing yield.

J 3′-UTR region (344 bp) is:

(SEQ ID NO: 727) TGATGGTTCAAGTAGTTTGTCTAGTTCCTGTACTTTCTTGGAGTATGTCATGTAACGAGCTGTTGTATTTATAATTTTTGTTTTGGTTTTTGACCTTGTTACATACAGCACATTGGTATGTAGATATATCTGGCATATCAAAACTGGTCAAAATGTAACATTATTTTATGGTATCATGTTGTTTCCATACATAAGTGTTCGTTTTACACAGTAGTAATTGCTTTACCCGAAAGAGTAGGTGCTCTGCTTCCTCTGTGCATGGGAGGAGGTTTTATATTGCTTGCCTAGAAGACTAGTGATTGTCATACACATTAGCTGTTATTCTAATTGATTGTTTCATGTCA.

The mRNA destabilizing element (doi:10.1105/tpc.107.055046) (5′/Phos/A*A*TTTTAATTTTAATTTTAATTTTAATTTTAATT*T*T-3′, SEQ ID NO:214) is inserted at a Cas9/guide RNA cutting site 5032 bp, 5110 bp, or 5255 bp downstream of the TSS. The crRNA sequences are GACAUACUCCAAGAAAGUACGUUUUAGAGCUAUGCU (SEQ ID NO:728), CCUUGUUACAUACAGCACAUGUUUUAGAGCUAUGCU (SEQ ID NO:729), and AAACCUCCUCCCAUGCACAGGUUUUAGAGCUAUGCU (SEQ ID NO:730).

As an exemplary assay for readout, QPCR can be used to check the transcript level of J. As a phenotypic readout, edited plants can be scored for late flowering.

Example 18

Abscisic acid (ABA) plays a crucial role in the plant response to both biotic and abiotic stresses. Mutations in PYR/PYL receptor proteins have been identified that result in hypersensitivity to ABA and thereby enhance plant drought resistance (U.S. Patent Application Publication 2016/0194653). Mutation of GmPYL9 (GeneID: 100810273) at E137, which corresponds to amino acid E141 in Arabidopsis thaliana PYR1, can enhance the sensitivity to ABA.

GmPYL9 exon 3 (216 bp, 2827-3042 bp downstream of TSS, the nucleotides encoding E137 are underlined) is:

(SEQ ID NO: 731) AACTATTCTTCCATAATCACCGTCCATCCAGAGGTCATCGATGGGAGACCCGGTACAATGGTGATCGAATCATTTGTGGTGGATGTGCCTGATGGGAACACCAGGGATGAAACTTGTTACTTTGTGGAGGCTTTGATCAGGTGTAACCTAAGCTCTTTAGCTGATGTCTCAGAGAGGATGGCCGTGCAAGGTCGAACCAA TCCTATCAACCATTAA.

The amino acid sequence encoded by this region, with E137 (underlined) to be mutated to L to increase GmPYL9 sensitivity to ABA, is

(SEQ ID NO: 732) NYSSIITVHPEVIDGRPGTMVIESFVVDVPDGNTRDETCYFVEALIRCNLSSLADVSERMAVQGRTNPINH*.

The crRNA sequence for the Cas9/guide RNA with a cutting site 2902 bp downstream of the TSS is GGUGAUCGAAUCAUUUGUGGGUUUUAGAGCUAUGCU (SEQ ID NO:733).

The donor template, in which E137 is mutated to L, with flanking regions of 100 bp (underlined), is

(SEQ ID NO: 734) GGTTTAATACTCAATCATGTTGTGGAATTTGCAGAACTATTCTTCCATAATCACCGTCCATCCAGAGGTCATCGATGGGAGACCCGGTACAATGGTGATCCTATCATTTGTGGTGGATGTGCCTGATGGGAACACCAGGGATGAAACTTGTTACTTTGTGGAGGCTTTGATCAGGTGTAACCTAAGCTCTTTAGCTGATG TC.

As an exemplary assay for readout, CRISPR amplicon sequencing can be used to confirm the GA to CT mutation. Seedlings of edited plants can be treated with 10 µM ABA for 1 hour, and then the expression level of ABA-inducible genes can be measured (higher in edited seedlings compared to the wild-type control).
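Confirming the intended edit in sequencing reads amounts to comparing a read against the reference and reporting the substituted bases. The following minimal sketch compares matching 31-nucleotide excerpts of the wild-type exon (SEQ ID NO:731) and the donor template (SEQ ID NO:734); the function name is hypothetical, and real amplicon analysis would additionally need read alignment and quality filtering, which are outside this illustration.

```python
# Illustrative only: assumes the read is already aligned to the reference excerpt.
def substitutions(ref, alt):
    """Return (0-based start, ref bases, alt bases) for each run of adjacent mismatches."""
    if len(ref) != len(alt):
        raise ValueError("sequences must have equal length")
    events = []
    i = 0
    while i < len(ref):
        if ref[i] != alt[i]:
            j = i
            while j < len(ref) and ref[j] != alt[j]:
                j += 1
            events.append((i, ref[i:j], alt[i:j]))
            i = j
        else:
            i += 1
    return events

WT_EXCERPT = "ACAATGGTGATCGAATCATTTGTGGTGGATG"     # from SEQ ID NO: 731
DONOR_EXCERPT = "ACAATGGTGATCCTATCATTTGTGGTGGATG"  # from SEQ ID NO: 734

if __name__ == "__main__":
    print(substitutions(WT_EXCERPT, DONOR_EXCERPT))
```

The single event reported is the GA to CT change, which converts the GAA codon to CTA (E to L under the standard genetic code).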

Example 19

Soybean is a short-day crop with high protein and oil contents. Many cultivars have been bred with different maturity to adapt to various environments. Flowering and maturity are highly controlled by major genes in soybean. Up to now, nine maturity loci have been identified as E1-E8 and J. E2 (GeneID: 100800578) is an orthologue of the Arabidopsis flowering gene GIGANTEA. E2 delays flowering and maturity under long-day conditions by downregulating GmFT2a and GmFT5a. Recessive e2 alleles promote flowering and maturity (doi:10.1534/genetics.110.125062). Higher methylation at the promoter of the E2 gene can reduce its expression and generate a weaker epiallele for early flowering and maturity.

Constructs encoding dCas9-SunTag and anti-GCN4 scFv fused to proteins involved in RNA-directed DNA methylation, for example DEFECTIVE IN MERISTEM SILENCING3 (DMS3), the de novo DNA methyltransferase DOMAINS REARRANGED METHYLTRANSFERASE2 (DRM2), or the tobacco DRM catalytic domain (doi:10.1016/j.cell.2014.03.056), are used to target the E2 gene promoter to increase methylation.

E2 gene promoter (539 bp, including 300 bp upstream of TSS and 5′-UTR in uppercase) is:

(SEQ ID NO: 735) aaattctaaaaataaaaaaaataatttataaggataaaaaaattaaatcataaaaatcaaaaagatatttcaaccttaacgttaaatacatatgtaataatacagttagaattacaaataaggtgactgcagtagaatatatgtgtaatgacgatggatgggcatccatgtccatgataaagatggatggacatgaacatacatacatatgtgtaagagtagcatgacgtgacacggggtaggactcaggaggtgagaggaaaatatcattgccacgtattcaatcaatctaggcttctcAGAGCTGCTAAAGATCTTCTCTTTCTTTCTCTCTCCCAGTTGTGTTCTCTTCTCTAGTTTGTTTCTGCAGTTTTGCCTCTCTCTCTCTCTCTCTCACTATCTATCTATCCCACCATAACCATTAACCACCACCTCTTAAATATTTTTCCACAAATCACCAAAATTTCCCATTTTTTTCACCCTCTGAATCACAATTTTTTTCTTTCTAACTAAAATCGCCTCTACACACAAGGATTCAG.

Three guide RNAs targeting the promoter and 5′-UTR regions of the E2 gene are designed to bring the DNA methyltransferase or DMS3 to the promoter and increase its methylation level. The crRNA sequences are GAGUAGCAUGACGUGACACGGUUUUAGAGCUAUGCU (SEQ ID NO:736), GCCUAGAUUGAUUGAAUACGGUUUUAGAGCUAUGCU (SEQ ID NO:737), and CUAGAGAAGAGAACACAACUGUUUUAGAGCUAUGCU (SEQ ID NO:738).
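For illustration only, the following sketch locates the three crRNA target sites in a contiguous excerpt of the E2 promoter sequence of SEQ ID NO:735, searching both strands and reporting the three bases adjacent to each site. The excerpt boundaries and the helper names are choices made for this sketch, and an NGG-type PAM requirement is assumed rather than stated in the example.

```python
# Illustrative sketch only; helper names are hypothetical.

# Contiguous excerpt of SEQ ID NO:735 spanning the end of the upstream region
# (lowercase in the listing) and the start of the 5'-UTR (uppercase).
PROMOTER_EXCERPT = (
    "gtgtaagagtagcatgacgtgacacggggtaggactcaggaggtgagaggaaaatatcattgcc"
    "acgtattcaatcaatctaggcttctcAGAGCTGCTAAAGATCTTCTCTTTCTTTCTCTCTCCC"
    "AGTTGTGTTCTCTTCTCTAGTTTG"
).upper()

REPEAT = "GUUUUAGAGCUAUGCU"  # 3' portion shared by the listed crRNAs
CRRNAS = {
    "SEQ ID NO:736": "GAGUAGCAUGACGUGACACGGUUUUAGAGCUAUGCU",
    "SEQ ID NO:737": "GCCUAGAUUGAUUGAAUACGGUUUUAGAGCUAUGCU",
    "SEQ ID NO:738": "CUAGAGAAGAGAACACAACUGUUUUAGAGCUAUGCU",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(dna: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def find_site(region: str, crrna: str):
    """Return (strand, 0-based index on the listed strand, adjacent PAM) or None."""
    spacer = crrna[: -len(REPEAT)].replace("U", "T")
    i = region.find(spacer)
    if i >= 0:
        # Spacer matches the listed strand; the PAM follows it.
        return "+", i, region[i + len(spacer): i + len(spacer) + 3]
    i = region.find(revcomp(spacer))
    if i >= 0:
        # Spacer matches the opposite strand; the PAM precedes the match
        # on the listed strand and is read as its reverse complement.
        return "-", i, revcomp(region[max(i - 3, 0): i])
    return None


sites = {name: find_site(PROMOTER_EXCERPT, seq) for name, seq in CRRNAS.items()}
for name, site in sites.items():
    print(name, site)
```

In this excerpt the first guide matches the listed strand and the other two match the opposite strand, each next to a sequence of the NGG form.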

As an exemplary assay for readout, QPCR can be used to check the transcript level of the E2 gene, and bisulfite sequencing can be used to determine the targeted methylation region in the promoter and 5′-UTR.

Example 20

This example describes the preparation of reagents for the modification of three genes in a soybean plant cell to provide increased nitrogen use efficiency.

Soybean protoplasts are prepared as described in Examples 1-5. Preparation of reagents for gene editing is carried out using procedures similar to those described in Examples 6-9.

An increase in expression of NRT (GLYMA_12G078900) is predicted to increase nitrogen use efficiency (NUE). The sequence of NRT is shown as AATTTAATCTAATGGTAGATAATGTGTTCAAAGGAACGCTTGATAACATTTCTCGTGATAAATACGTATTTATGAGACTATTTAGTTATGATCATCCA

AAAGTAA TGATCATGTGCCAAGTTGCCACCCATAATTTATCTCAAAATTAATGAAACCCAAATAAAGGCGTTGAATAATACCACCATACAAAAGTGTGTTATTTAGCAGCATATGTAACTAGGCATATATCTATCTGTATATATGAGAGTTGATTATGTGTCACATAT

GGGTTC TTTTTGGCATACGCGGCGAAATGGATTACGTCAAATACAGCTTTTGTTTAATGCTTAAAGCTTTGGCAGCCGATGGAAATTTCATTGGCATTGTCAACGCCTTCCCCTACTATAAGTACAATCAC ACTCCT

AGGCCTTCAATTTGGTTTTGTTTCATCAGTTTTCCAGATACAGCACATTGATTGTTAAGGCGAAATGGCTGATATTGAGGGTT (SEQ ID NO:739). Similar to the previous examples, a nitrogen-responsive element (NRE) sequence having the sequence of AGAAACAACTTGACCCTTTACATTGCTCAAGAGCTCATCTCTT (SEQ ID NO:740) and encoded by a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule is designed for insertion at a double-strand break effected between nucleotides at positions 101/102, 303/304, and 446/447 of SEQ ID NO:739 in order to enhance the expression of NRT. Three NRT crRNAs are designed to target the nucleotides shown in bold italic in SEQ ID NO:739 and have the sequences of AGUGUUGUGAGGGAGAGACAGUUUUAGAGCUAUGCU (SEQ ID NO:741), GAACCUUUGAGACAUACCAUGUUUUAGAGCUAUGCU (SEQ ID NO:742), and GGGUUGGAAAUUAAUUGACAGUUUUAGAGCUAUGCU (SEQ ID NO:743). All crRNAs and tracrRNA are purchased from Integrated DNA Technologies, Coralville, Iowa. Ribonucleoprotein (RNP) complexes are prepared using gRNA (crRNA:tracrRNA) and Cas9 nuclease (Aldevron, Fargo, N. Dak.).

An increase in expression of NRT and NRT2 (Glyma13g39850.1) simultaneously is predicted to further increase nitrogen use efficiency (NUE). A partial genomic sequence of NRT2 is provided as TTGTTTACTCCTAGTTATTATCTTAAAAAAATTGAATCATATAATTATATATTAAGTTTTGAATATGTGTTTCCATCTTATAGTTTATGAGATTACCA
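The "101/102" style of coordinate used above denotes a break between two adjacent, 1-based nucleotide positions. As a minimal sketch of how an insertion at such a junction can be modelled, the following uses the NRE of SEQ ID NO:740 and a made-up placeholder target (not SEQ ID NO:739); the function name and the placeholder are assumptions introduced solely for illustration.

```python
# Illustrative sketch only; the target below is a placeholder, not a soybean sequence.

NRE = "AGAAACAACTTGACCCTTTACATTGCTCAAGAGCTCATCTCTT"  # SEQ ID NO:740


def insert_at_junction(target: str, left_position: int, donor: str) -> str:
    """Insert donor between 1-based positions left_position and left_position + 1."""
    if not 1 <= left_position < len(target):
        raise ValueError("junction must lie inside the target sequence")
    return target[:left_position] + donor + target[left_position:]


placeholder_target = "ACGT" * 30              # hypothetical 120-nt target
edited = insert_at_junction(placeholder_target, 101, NRE)  # the "101/102" junction

print(len(placeholder_target), len(edited))
```

The first 101 bases of the target are unchanged, the donor occupies the next 43 positions, and the remainder of the target follows.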

CTACAAAC TTTAAAAGTAAGCAGTAGATACATAATAGTTTTATAGGCCTGGTTGGTTAGC

CGGATAATGAACCCCAATGATGAAAACATGCAGACGCATGTTGCAGCATGGAAGTATTTTATTTAATAAGAATAATAATAAGGTAAGTGGTAGTAATTAAATTCCATATTCAGTATCATGGGAAATGAGATTCTTTGCCTTTGGGATACACCATTAGGCTTTTAGCCGTTCCA

GCATTACTCCATGGCCCTTGGGAATCCACTTGCCTCCTATCAGACTCTTACGTAGTCAACGCCTTCGCCTACTATAAAAACAC (SEQ ID NO:744). A nitrogen-responsive element (NRE) sequence having the sequence of SEQ ID NO:740 and encoded by a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule is designed for insertion at a double-strand break effected between nucleotides at positions 101/102, 195/196, and 374/375 of SEQ ID NO:744 in order to enhance the expression of NRT2. Three NRT2 crRNAs are designed to target the nucleotides shown in bold italic in SEQ ID NO:744 and have the sequences of AUUUCGCCGCAUAUACACAGGUUUUAGAGCUAUGCU (SEQ ID NO:745), UGAAAUUUACAGCUACUACGGUUUUAGAGCUAUGCU (SEQ ID NO:746), and AUCCCAAUCUGUUAAACACAGUUUUAGAGCUAUGCU (SEQ ID NO:747). All crRNAs and tracrRNA are purchased from Integrated DNA Technologies, Coralville, Iowa. Ribonucleoprotein (RNP) complexes are prepared using gRNA (crRNA:tracrRNA) and Cas9 nuclease (Aldevron, Fargo, N. Dak.).

An increase in expression of NRT, NRT2 and glutamine synthetase (GS, Glyma.07G104500) simultaneously can even further increase nitrogen use efficiency (NUE). Constitutive overexpression of GS has been shown to result in increased photosynthesis under low nitrate conditions (see, e.g., doi:10.1104/pp.020013). In this example, the expression of GS is constitutively increased by inserting a constitutive enhancer sequence. A partial genomic sequence of GS is provided as CAAAAATTAATTCTTTTTAGTAATGATAGAATCTAATATCTTAATTCAATGATTAATTATAACTTAAGTCTTCCTTTAAAATAAATCTCATCTCATCTCCT

ATCTCAT CTCATTCTTCGGTGATCAAATCTAGTGCCAGTACCGTACTTGGTACGCTACCTTCACTTGCCT

ACCTACCTTTCATAATTTAATATAAAAAATAAATAAACAATGTCGCTGCAAAGCATGTTCATGTTCATTAATTCATTTTTATTATTAAAAAAAAAACACCCCTTTATT

CGGTATCTTTCCACCACTTTCTTTATCTTTAGAGATCTTCTTTTATATATATATATATATATAGATAGATAGATAGATAGATACAGAGATGAAAAATACT (SEQ ID NO:748). A maize OCS homologue encoded by a chemically modified single-stranded DNA with the sequence of GTAAGCGCTTAC (SEQ ID NO:749, Integrated DNA Technologies, Coralville, Iowa), phosphorylated on the 5′ end and containing two phosphorothioate linkages at each terminus (i.e., the two linkages between the most distal three bases on either end of the strand), is designed for insertion at a double-strand break effected between nucleotides at positions 103/104, 193/194, and 331/332 of SEQ ID NO:748 in order to provide constitutively increased expression of GS. Three GS crRNAs are designed to target the nucleotides shown in bold italic in SEQ ID NO:748 and have the sequences of GUGAUAGCUGAUAAGCACAUGUUUUAGAGCUAUGCU (SEQ ID NO:750), UUAGGCGGCGGAAAAACUCAGUUUUAGAGCUAUGCU (SEQ ID NO:751), and UCUCUCUCAAAAAAGGAAGAGUUUUAGAGCUAUGCU (SEQ ID NO:752). The crRNAs and tracrRNA are purchased from Integrated DNA Technologies, Coralville, Iowa. Ribonucleoprotein (RNP) complexes are prepared using gRNA (crRNA:tracrRNA) and Cas9 nuclease (Aldevron, Fargo, N. Dak.).
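The end modifications described above (a 5′ phosphate and two phosphorothioate linkages between the three most distal bases at each terminus) can be written in the shorthand used elsewhere in this document, in which "/Phos/" marks the 5′ phosphate and "*" marks a phosphorothioate linkage. The sketch below, offered only as an illustration with hypothetical function names, converts between a bare sequence and that shorthand; it is not a vendor ordering format.

```python
# Illustrative sketch only; function names are hypothetical and the notation
# follows the strings printed in this document (e.g., SEQ ID NO:214).


def protect_oligo(seq: str, links_per_end: int = 2, phosphate: bool = True) -> str:
    """Write seq with '*' between the most distal bases at each terminus."""
    if len(seq) <= 2 * links_per_end:
        raise ValueError("sequence too short for the requested modifications")
    head = "".join(base + "*" for base in seq[:links_per_end])
    tail = "".join("*" + base for base in seq[-links_per_end:])
    middle = seq[links_per_end:-links_per_end]
    return ("/Phos/" if phosphate else "") + head + middle + tail


def strip_notation(annotated: str) -> str:
    """Recover the bare base sequence from the shorthand."""
    return annotated.replace("/Phos/", "").replace("*", "").replace(" ", "")


ocs_homologue = protect_oligo("GTAAGCGCTTAC")          # SEQ ID NO:749
seq214 = "/Phos/A*A*TTTTAATTTTAATTTTAATTTTAATTTTAATT*T*T"  # as printed for SEQ ID NO:214

print(ocs_homologue)
print(strip_notation(seq214))
```

Round-tripping the printed string for SEQ ID NO:214 through `strip_notation` and `protect_oligo` reproduces it, which suggests the shorthand is being read consistently.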

Example 21

This example describes the modification of two soybean genes to make early flowering soybeans.

FT2a and E2 regulate flowering time in soybean. The gene modification for increasing the expression of FT2a is described in Example 14. The gene modification for making the epigenetic change of E2 is described in Example 19.

As an exemplary assay for readout, QPCR can be used to check the transcript level of FT2a and E2. The edited plants are measured for flowering time.
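The examples do not specify how QPCR data are to be analyzed. One widely used option is the comparative Ct (2^-ΔΔCt) calculation, sketched below under the assumptions of a single reference gene and approximately 100% amplification efficiency. All Ct values shown are invented for illustration and are not experimental results.

```python
# Illustrative sketch only; Ct values are hypothetical, not measured data.


def relative_expression(ct_target_edited: float, ct_ref_edited: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of the target transcript in edited vs. control tissue (2^-ddCt)."""
    delta_edited = ct_target_edited - ct_ref_edited      # normalize to reference gene
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_edited - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical case: target crosses threshold two cycles later in edited plants.
fold = relative_expression(
    ct_target_edited=26.0, ct_ref_edited=20.0,
    ct_target_control=24.0, ct_ref_control=20.0,
)
print(fold)  # a value below 1 indicates reduced transcript level
```

With these made-up inputs the target transcript is at one quarter of the control level; a gene with increased expression would instead give a value above 1.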

Example 22

It is predicted that modification of NRT, NRT2, and GS in soybean will result in soybean cells with increased nitrogen use efficiency (NUE), and, further, that the additional modification of FT2a will result in early flowering, higher yielding soybean plants (see, e.g., U.S. Patent Application Publication 20160304891 A1, incorporated herein by reference).

Soybean protoplasts are prepared as described in Examples 1-5. Preparation of reagents is completed essentially as described in Examples 6-10.

An increase in expression of NRT (GLYMA_12G078900) is predicted to increase nitrogen use efficiency (NUE). The sequence of NRT is shown as SEQ ID NO:739. Similar to the previous examples, a nitrogen-responsive element (NRE) sequence having the sequence of SEQ ID NO:740 and encoded by a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule is designed for insertion at a double-strand break effected between nucleotides at positions 101/102, 303/304, and 446/447 of SEQ ID NO:739 in order to enhance the expression of NRT. Three NRT crRNAs are designed to target the nucleotides shown in bold italic in SEQ ID NO:739 and have the sequences of SEQ ID NOs:741, 742, and 743. All crRNAs and tracrRNA are purchased from Integrated DNA Technologies, Coralville, Iowa. Ribonucleoprotein (RNP) complexes are prepared using gRNA (crRNA:tracrRNA) and Cas9 nuclease (Aldevron, Fargo, N. Dak.).

An increase in expression of NRT and NRT2 (Glyma13g39850.1) simultaneously is predicted to further increase nitrogen use efficiency (NUE). A partial genomic sequence of NRT2 is provided as SEQ ID NO:744. A nitrogen-responsive element (NRE) sequence having the sequence of SEQ ID NO:740 and encoded by a polynucleotide (such as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule is designed for insertion at a double-strand break effected between nucleotides at positions 101/102, 195/196, and 374/375 of SEQ ID NO:744 in order to enhance the expression of NRT2. Three NRT2 crRNAs are designed to target the nucleotides shown in bold italic in SEQ ID NO:744 and have the sequences of SEQ ID NOs:745, 746, and 747. All crRNAs and tracrRNA are purchased from Integrated DNA Technologies, Coralville, Iowa. Ribonucleoprotein (RNP) complexes are prepared using gRNA (crRNA:tracrRNA) and Cas9 nuclease (Aldevron, Fargo, N. Dak.).

An increase in expression of NRT, NRT2 and glutamine synthetase (GS, Glyma.07G104500) simultaneously can even further increase nitrogen use efficiency (NUE). Constitutive overexpression of GS has been shown to result in increased photosynthesis under low nitrate conditions (see, e.g., doi:10.1104/pp.020013). In this example, the expression of GS is constitutively increased by inserting a constitutive enhancer sequence. A partial genomic sequence of GS is provided as SEQ ID NO:748. A maize OCS homologue encoded by a chemically modified single-stranded DNA with the sequence of SEQ ID NO:749 (Integrated DNA Technologies, Coralville, Iowa), phosphorylated on the 5′ end and containing two phosphorothioate linkages at each terminus (i.e., the two linkages between the most distal three bases on either end of the strand), is designed for insertion at a double-strand break effected between nucleotides at positions 103/104, 193/194, and 331/332 of SEQ ID NO:748 in order to provide constitutively increased expression of GS. Three GS crRNAs are designed to target the nucleotides shown in bold italic in SEQ ID NO:748 and have the sequences of SEQ ID NOs:750, 751, and 752. The crRNAs and tracrRNA are purchased from Integrated DNA Technologies, Coralville, Iowa. Ribonucleoprotein (RNP) complexes are prepared using gRNA (crRNA:tracrRNA) and Cas9 nuclease (Aldevron, Fargo, N. Dak.).

FT2a (Glyma.16G150700) is the mobile flowering trigger in soybean and an increase in expression of FT2a is anticipated to trigger flowering. Early flowering is not normally a desirable phenotype as early-flowering plants do not maintain high vegetative growth rates, resulting in overall lower yields. It is predicted that a short burst of FT2a expression will be sufficient to trigger flowering while allowing the plants to maintain vegetative growth, resulting in an everbearing and high-yielding phenotype. Thus, in addition to the increased nitrogen utilization efficiency achieved by modification of NRT, NRT2, and GS, an auxin-inducible element is integrated in the promoter of the FT2a gene. A partial genomic sequence of FT2a is provided as AAAGAAGCTATGAGGTGCAAGAACCGATCACATGGAGAAGGCAATGAAAGACAAGGAGGAGCAATGGAAGAGAGAAAATGAGAAGATGGAAGGGATGT

AGG TGATCAGTTTTAAAATACGAATTTAGTATTTTCTTTTTAAGAAAATTCTTTCGGAAAGTCGTGTTTTAAAACATGACTTTTATTTATTTGAAGTCGTGTTCTAAAACATGACTTTATTTCATATCCTTTAATATTTTATATCCTTAATATTTTTAAAATTTATCCATTTGTAATATTTTTTAAAAATTGACCCATATATGTAAAATACCC

TTGAAAGCGAAAGCATATCACTTC AAACACAATGGAATCGAGGCTATTGACTAAGTATAA

GGGGTTC ATAATTCATAACAAAGCAAACGAGTATATAAGAAAGCATAAGCCAAATTTTGAGTAAACTAGTGTGCACACTATCCC (SEQ ID NO:753). The auxin-responsive element 3×DR5 with the sequence of GCUCCUCACUAGCUACCAAGGUUUUAGAGCUAUGCU (SEQ ID NO:754) is provided as a polynucleotide (e.g., as a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, or a double-stranded DNA/RNA hybrid) donor molecule for insertion at a double-strand break effected between nucleotides at positions 115/116, 334/335, and 428/429 of SEQ ID NO:753. Three FT2a crRNAs are designed to target the nucleotides shown in bold italic in SEQ ID NO:753 and have the sequences of GAAAAUGUUUGAAAAAAACGGUUUUAGAGCUAUGCU (SEQ ID NO:755), AUAGAGAAGACUUCAUAUCGGUUUUAGAGCUAUGCU (SEQ ID NO:756) and AAUAAUAAAGAGAUCUUGACGUUUUAGAGCUAUGCU (SEQ ID NO:757). The crRNAs and tracrRNA are purchased from Integrated DNA Technologies, Coralville, Iowa. Ribonucleoprotein (RNP) complexes are prepared using gRNA (crRNA:tracrRNA) and Cas9 nuclease (Aldevron, Fargo, N. Dak.).

Example 23

This example describes additional genomic modifications that further enhance the effects of the modifications of soybean genes described in Example 22. E1 (Glyma.06G207800.1) is a large effect flowering time gene in soybean and has been reported to be a repressor of two genes involved in the induction of flowering, FT2a and FT5a (see, e.g., doi:10.1104/pp.15.00763). It is predicted that stacking the inducible increased expression of FT2a described in Example 22 with a modest decrease in expression of E1 will result in early flowering with increased yield outcomes. In this example, a SAUR mRNA destabilizing sequence is integrated in the 3′ untranslated region (3′ UTR) of the E1 gene. SAUR destabilizing sequences result in reduced expression due to increased mRNA degradation (see, e.g., doi:10.1105/tpc.5.6.701, and U.S. Patent Application Publication 2007/0011761, incorporated herein by reference). A partial genomic sequence of E1 is provided as ATCGGATTTCATTGGGATCCATATAATTGCGTTTTCAATTTCTGTGTCCTTAAACAAGCTATGCCAGAGAATTAATTTAATTTTAAGTGTTAGCTTTATT

AGGAAA ACAATGGCCTATATATTATTCCT

TTATTGCAATAGCGTGTACTTCAACCTAATTATTTAATACCAAGTTTCTATATTAATGTTGTATCTTATGAAATCCTTCTATTTTCCATTCTATAAATTA (SEQ ID NO:758). A SAUR mRNA destabilizing element with the sequence of AGATCTAGGAGACTGACATAGATTGGAGGAGACATTTTGTATAATAAGATCTAGGAGACTGACATAGATTGGAGGAGACATTTTGTATAATA (SEQ ID NO:759) is designed for insertion at a double-strand break effected between nucleotides at positions 117/118 or 152/153 of SEQ ID NO:758. The SAUR destabilizing element in the form of a single-stranded DNA molecule, phosphorylated on the 5′ end and containing two phosphorothioate linkages at each terminus (i.e., the two linkages between the most distal three bases on either end of the strand), is purchased from Integrated DNA Technologies, Coralville, Iowa. Two E1 crRNAs are designed to target the nucleotides shown in bold italic in SEQ ID NO:758 and have the sequences of AUUUUACUUUCAAAUCAUUGGUUUUAGAGCUAUGCU (SEQ ID NO:760) and CAUUAUUGUAUGUUACAUAUGUUUUAGAGCUAUGCU (SEQ ID NO:761). The E1 crRNAs and tracrRNA are purchased from Integrated DNA Technologies, Coralville, Iowa. Ribonucleoprotein (RNP) complexes are prepared using gRNA (crRNA:tracrRNA) and Cas9 nuclease (Aldevron, Fargo, N. Dak.).

Example 24

This example describes the modification of two soybean genes to increase drought resistance and to enhance determinate growth.

Modification of the GmPYL9 gene and the dt1 gene together can increase drought resistance and enhance determinate growth. The modification of the GmPYL9 gene is described in Example 18. The modification of the dt1 gene is described in Example 13.

As an exemplary assay for readout, QPCR can be used to check the transcript level of GmPYL9 and dt1. The edited plants can be tested for drought resistance and determinate growth.

Example 25

Soy productivity can be improved by targeting processes associated with non-photochemical quenching (NPQ). This is a strategy to increase photosynthetic efficiency. Orthologous genes to those targeted in Kromdijk et al. (2016, Science, 354 (6314): 857-61) can be up-regulated by inserting an enhancer element in the promoter proximal region of the soybean orthologous genes for the chloroplastic photosystem II 22 kDa protein (PsbS), violaxanthin de-epoxidase (VDE) and zeaxanthin epoxidase (ZEP). Upregulation of each individual gene in model plants has a low to marginal effect on photosynthetic efficiency (Hubbart et al., 2012, The Plant Journal: For Cell and Molecular Biology, 71 (3): 402-12; Leonelli et al., 2016, The Plant Journal: For Cell and Molecular Biology, 88 (3): 375-86). Up-regulating all three genes together produces a much larger effect. Kromdijk et al. (2016) demonstrated this in tobacco by inserting transgenes driven by highly active green tissue-preferred promoters. Here we demonstrate how this can be done using gene editing technology.

There is scattered evidence for ‘enhancer’ elements that function in plants and may be applicable here (Marand et al., 2017, Plant Gene Regulatory Mechanisms and Networks, 1860 (1): 131-39). Most of the heavily used cis-enhancer elements come from plant pathogens. For example, there are well characterized virus-derived enhancer elements that work well in crops like maize (Davies et al., 2014, BMC Plant Biology, 14 (December): 359), but these are less desirable due to their length (generally 1-200 bp). Technology to scan genomes for putative regulatory elements that are ˜20 bp in length is available. Most of the work has been done in Arabidopsis (Burgess et al., 2015, Current Opinion in Plant Biology, 27 (October): 141-47) and this remains a largely theoretical area. Here we use the nopaline synthase OCS element (5′-ACGTAAGCGCTTACGT-3′, SEQ ID NO:210; Ellis et al., 1987, The EMBO Journal, 6 (11): 3203-8) to upregulate the three soybean NPQ genes. G-box (5′-/Phos/G*C* CAC GTG CCG CCA CGT GCC GCC ACG TGC CGC CAC GTG*C*C-3′, SEQ ID NO:211) and green tissue-specific promoter (GSP, 5′-/Phos/A*A*AATATTTATAAAATATTTATAAAATATTTATAAAATATTT*A*T-3′, SEQ ID NO:212) can also be used to upregulate the NPQ genes.

Gene editing strategies for increasing the expression of these genes by insertion of an enhancer element in the promoter of each family member are listed in Table 8 below.

TABLE 8

Gene family   Member            Gene ID     Change in gene activity   Element inserted
PSBS          Glyma.06G113200   100779417   Increase expression       OCS, G-box, or GSP
PSBS          Glyma.04G249700   100807355   Increase expression       OCS, G-box, or GSP
ZEP           Glyma.11g055700   100820171   Increase expression       OCS, G-box, or GSP
ZEP           Glyma.17G174500   100800186   Increase expression       OCS, G-box, or GSP
VDE           Glyma.03G253500   100816085   Increase expression       OCS, G-box, or GSP
VDE           Glyma.19G251000   100778118   Increase expression       OCS, G-box, or GSP

An appropriate guide RNA is designed to target Cas9 or its equivalent to one or more sites within 500 bp of each gene's transcription start site. The site-directed endonuclease can be delivered as a ribonucleoprotein (RNP) complex or encoded on plasmid DNA. Synthetic oligonucleotides representing 1-5 copies of the OCS element (Ellis et al. 1987) are co-delivered with the site-specific endonuclease. The design of each guide RNA can be developed and optimized in a soybean protoplast system, using qRT-PCR of mRNA from each target gene to report successful up-regulation. Alternatively, a synthetic promoter for each target gene, linked to a suitable reporter gene like eGFP, can be co-transfected with the site-specific RNPs and enhancer oligonucleotide into protoplasts to evaluate the efficacy of all possible guide RNAs and to identify the most effective candidates for delivery to whole plant tissues. The most effective RNPs are used to insert the NPQ technology in soybean.

One of many published methods for introducing the OCS enhancer element into the predetermined target sites of the soybean NPQ genes can be used to generate lines with enhanced photosynthetic activity. These include Agrobacterium-, viral- or biolistic-mediated approaches (Senapati, 2016 (doi:10.9734/ARRB/2016/22300)). The RNPs and OCS element can also be introduced by microinjection.

Soybean tissues that are appropriate for genome editing manipulation include embryogenic callus, exposed shoot apical meristems and 1 DAP embryos. There are many approaches to producing embryogenic callus (Lee et al., 2013 (doi:10.5772/51076), Homrich et al., 2012 (doi:10.1590/S1415-47572012000600015), Maheshwari and Kovalchuk, 2016 (doi:10.1016/B978-1-893997-98-1.00014-2), Finer, 2016 (doi:10.1002/cppb.20039)). Shoot apical meristem explants can be prepared using a variety of methods in the art (Sticklen and Oraby, 2005 (doi:10.1079/IVP2004616), Baskaran and Dasgupta, 2012 (doi:10.1007/s13562-011-0078-x), Senapati, 2016 (doi:10.9734/ARRB/2016/22300)). This work describes how to prepare and nurture material that is adequate for microinjection.

To prepare soy explants for microinjection, soy seed is sterilized, imbibed and dissected as described in Yang et al., 2016 (doi:10.1007/s11738-016-2081-2) or Luth et al., 2015 (doi:10.1007/978-1-4939-1695-5_22). The cotyledon sections can be cultivated with or without the cytokinin 6-benzylaminopurine (BAP), which produces de novo vegetative buds (Buising, 1992 (lib.dr.iastate.edu/rtd/9821)). The 2-3 mm embryonic axes are placed on the appropriate solid support medium and oriented for easy access using a microinjection needle.

Microinjection is used to target specific cells in the shoot apical meristem. For example, an injector attached to a Narishige manipulator on a dissecting microscope is adequate for injecting relatively large cells (e.g., the egg/synergids/zygote and the central cell). For smaller cells, such as those of the embryo or shoot apical meristem, a compound, inverted microscope with an attached Narishige manipulator can be used. Injection pipette diameter and bevel are also important. Use a high-quality pipette puller and beveler to prepare needles with adequate strength, flexibility and pore diameter. These will vary depending on the cargo being delivered to cells. The volume of fluid to be microinjected must be exceedingly small and must be carefully controlled. An Eppendorf Transjector yields consistent results (Laurie et al., 1999).

The genetic cargo can be RNA, DNA, protein or a combination thereof. For example, to introduce the NPQ trait, the cargo consists of the RNPs targeting the promoter proximal region of the soybean PsbS, VDE and ZEP genes, and the OCS enhancer element. The concentration of each cargo component will vary depending on the nature of the manipulation. Typical cargo volumes can vary from 2-20 nanoliters. After microinjection, treated plant parts are maintained on an appropriate medium alone or supplemented with a feeder culture. Plantlets are transferred to fresh media every two weeks and to larger containers as they grow. Plantlets with a well-developed root system are transferred to soil and maintained in high humidity for 5 days to acclimate. Plants are gradually exposed to the air and cultivated to reproductive maturity.

Several molecular techniques can be used to confirm that the intended edits are present in the treated plants. These include sequence analysis of each target site and qRT-PCR of each target gene. The NPQ trait is confirmed using a photosynthesis instrument, such as a Ciras 3 or Licor 6400, to compare modified plants to unmodified or wildtype plants. The NPQ trait is expected to increase biomass accumulation relative to wildtype plants.

Example 26

Table 10 provides a list of genes in soybean associated with various traits including abiotic stress resistance, plant architecture, biotic stress resistance, photosynthesis, and resource partitioning. Within each trait, various non-limiting (and often overlapping) sub-categories of traits may be identified, as presented in the Table. For example, “abiotic stress resistance” may be related to or associated with changes in abscisic acid (ABA) signaling, biomass, cold tolerance, drought tolerance, tolerance to high temperatures, tolerance to low temperatures, and/or salt tolerance; the trait “plant architecture” may be related to or may include traits such as biomass, fertilization, flowering time and/or flower architecture, inflorescence architecture, lodging resistance, root architecture, shoot architecture, leaf architecture, and yield; the trait “biotic stress” may include disease resistance, insect resistance, population density stress and/or shading stress; the trait “photosynthesis” can include photosynthesis and respiration traits; the trait “resource partitioning” can include or be related to biomass, seed weight, drydown rate, grain size, nitrogen utilization, oil production and metabolism, protein production and metabolism, provitamin A production and metabolism, seed composition, seed filling (including sugar and nitrogen transport), and starch production and metabolism. By modifying one or more of the associated genes in Table 10, each of these traits may be manipulated singly or in combination to improve the yield, productivity or other desired aspects of a soybean plant comprising the modification(s).

Each of the genes listed in Table 10 may be modified using any of the gene modification methods described herein. In particular, each of the genes may be modified using the targeted modification methods described herein which introduce desired genomic changes at specific locations in the absence of off-target effects. Even more specifically, each of the genes in Table 10 may be modified using the CRISPR targeting methods described herein, either singly or in multiplexed fashion. For example, a single gene in Table 10 could be modified by the introduction of a single mutation (change in residue, insertion of residue(s), or deletion of residue(s)) or by multiple mutations. Another possibility is that a single gene in Table 10 could be modified by the introduction of two or more mutations, including two or more targeted mutations. In addition, or alternatively, two or more genes in Table 10 could be modified (e.g., using the targeted modification techniques described herein) such that the two or more genes each contains one or more modifications.

The various types of modifications that can be introduced into the genes of Table 10 have been described herein. The modifications include both modifications to regulatory regions that affect the expression of the gene product (i.e., the amount of proteins or RNA encoded by the gene) and modifications that affect the sequence or activity of the encoded protein or RNA (in some cases the same modification may affect both the expression level and the activity of the encoded protein or RNA). Modifying genes encoding proteins with the amino acid sequences listed in Table 10 (e.g., SEQ ID Nos: 456-495, 497-530, 535-646, 648-65 and 656), or sequences with at least 95%, 96%, 97%, 98%, or 99% identity to the protein sequences listed in Table 10, may result in proteins with improved or diminished activity. Methods of alignment of sequences for comparison are well known in the art. Two examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al., (1990) J. Mol. Biol. 215:403-410; and Altschul et al., (1997) Nuc. Acids Res. 25:3389-3402, respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information. Optimal alignment of sequences for comparison can also be conducted, e.g., by the local homology algorithm of Smith and Waterman, (1981) Adv. Appl. Math. 2:482, by the homology alignment algorithm of Needleman and Wunsch, (1970) J. Mol. Biol. 48:443, by the search for similarity method of Pearson and Lipman, (1988) Proc. Nat'l. Acad. Sci. USA 85:2444, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by manual alignment and visual inspection (see, e.g., Brent et al., (2003) Current Protocols in Molecular Biology).
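As a simplified, non-limiting illustration of a percent identity calculation, the sketch below compares two sequences that have already been aligned to equal length (with "-" marking gaps) and counts identical columns. It is not an implementation of BLAST or of the other alignment algorithms cited above, which also perform the alignment itself and may define the denominator differently; the function name is hypothetical. The inputs are the exon 3-encoded fragment of GmPYL9 (SEQ ID NO:732, without the stop) and the same fragment with the glutamate changed to leucine.

```python
# Illustrative sketch only; assumes the two sequences are already aligned.


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical columns as a percentage of all alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


WILD_TYPE = ("NYSSIITVHPEVIDGRPGTMVIESFVVDVPDGNTRDETCYFVEALIRCNL"
             "SSLADVSERMAVQGRTNPINH")
VARIANT = WILD_TYPE[:22] + "L" + WILD_TYPE[23:]  # glutamate replaced by leucine

identity = percent_identity(WILD_TYPE, VARIANT)
print(round(identity, 2))
```

Over this 71-residue fragment a single substitution leaves 70 identical positions, so the fragment remains above each of the identity thresholds recited above.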

In some cases, targeted modifications may result in proteins with no activity (e.g., the introduction of a stop codon may result in a functionless protein) or a protein with a new activity or feature. Similarly, the sequences of the miRNAs listed in Table 10 may also be modified to increase, decrease or reduce their activity.

In addition to modifications to the encoded proteins or miRNAs, each of the gene sequences listed in Table 10 (e.g., the gene sequences of SEQ IDs 235-396 and 398-435) may also be modified, individually or in combination with one or more other modifications, to alter the expression of the encoded protein or RNA, or to alter the stability of the RNA encoding the corresponding protein sequence. For example, regulatory sequences affecting transcription of any of the genes listed in Table 10, including promoter sequences and transcription factor binding sequences, can be modified in, or introduced into, any of the gene sequences listed in Table 10 using the methods described herein. The modifications can effect increases in transcription levels, decreases in transcription levels, and/or changes in the timing of transcription of the genes under control of the modified regulatory regions. Targeted epigenetic modifications affecting gene expression may also be introduced.

One skilled in the art will be able to identify regulatory regions (e.g., regulatory regions in the gene sequences provided in Table 10) using techniques described in the literature, e.g., Bartlett A. et al, “Mapping genome-wide transcription-factor binding sites using DAP-seq.”, Nat Protoc. (2017) August; 12(8):1659-1672; O'Malley R. C. et al, “Cistrome and Epicistrome Features Shape the Regulatory DNA Landscape”, Cell (2016) September 8; 166(6):1598; He Y et al, “Improved regulatory element prediction based on tissue-specific local epigenomic signatures”, Proc Natl Acad Sci USA, (2017) February 28; 114(9):E1633-E1640; Zefu Lu et al, “Combining ATAC-seq with nuclei sorting for discovery of cis-regulatory regions in plant genomes”, Nucleic Acids Research, (2017) V45(6); Rurika Oka et al, “Genome-wide mapping of transcriptional enhancer candidates using DNA and chromatin features in maize”, Genome Biology (2017) 18:137. In addition, Table 9, provided herein, lists sequences which may be inserted into or otherwise created in the genomic sequences of Table 10 for the purpose of regulating gene expression and/or transcript stability.
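Several of the sequences in Table 9 are written with alternatives in parentheses, e.g., (C/T)ACGTGGC, and with N for any nucleotide. As a non-limiting sketch, such notation can be converted to a regular expression and used to scan a sequence for candidate sites. The function names and the short test sequence below are hypothetical, and only the listed strand is scanned.

```python
# Illustrative sketch only; helper names and the scanned sequence are hypothetical.
import re


def motif_to_regex(motif: str) -> str:
    """Convert '(A/T)' alternatives and 'N' wildcards to regex character classes."""
    pattern = re.sub(
        r"\(([ACGT](?:/[ACGT])+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        motif,
    )
    return pattern.replace("N", "[ACGT]")


def find_motif(motif: str, sequence: str) -> list:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    regex = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [m.start() for m in regex.finditer(sequence.upper())]


print(motif_to_regex("(C/T)ACGTGGC"))               # ABRE binding site motif
print(motif_to_regex("CAATNATTG"))                  # ATHB5 binding site motif
print(find_motif("(C/T)ACGTGGC", "ttCACGTGGCaa"))   # made-up test sequence
```

A fuller scan would also search the reverse complement of the sequence, since a binding site may lie on either strand.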

The stability of transcribed RNA encoded by the genes in Table 10 (inaddition to other genes), can be increased or decreased by the targetedinsertion or creation of the transcript stabilizing or destabilizingsequences provided in Table 9 (e.g., SEQ ID Nos. 198-200, 202 and 214).In addition, Table 10 includes miRNAs whose expression can be used toregulate the stability of transcripts comprising the correspondingrecognition sites. Additional miRNAs, miRNA precursors and miRNArecognition site sequences that can be used to regulate transcriptstability and gene expression in the context of the methods describedherein may be found in, e.g., U.S. Pat. No. 9,192,112 (e.g., Table 2),U.S. Pat. Nos. 8,946,511 and 9,040,774, the disclosures of each of whichare incorporated by reference herein.

Any of the above regulatory modifications may be combined to regulate a single gene, or multiple genes, or they may be combined with the non-regulatory modifications discussed above to regulate the activity of a single modified gene, or multiple modified genes. The genetic modifications or gene regulatory changes discussed above may affect distinct traits in the soybean cell or plant, or they may affect the same trait. The resulting effects of the described modifications on a trait or traits may be additive or synergistic. Modifications to the soybean genes listed in Table 10 may be combined with modifications to other sequences in the soybean genome for the purpose of improving one or more soybean traits. The soybean sequences of Table 10 may also be modified for the purpose of tracking the expression or localizing the expression or activity of the listed genes and gene products.

TABLE 9 Type of regulation Name of element Sequence SEQ IDregulatory control ABFs binding site CACGTGGC SEQ ID by TF motif NO: 19regulatory control ABRE binding site (C/T)ACGTGGC SEQ ID by TF motifNO: 20 regulatory control ABRE-like binding (C/G/T)ACGTG(G/T)(A/C)SEQ ID by TF site motif NO: 21 regulatory control ACE promoter motifGACACGTAGA SEQ ID by TF NO: 22 regulatory control AG binding siteTT(A/G/T)CC(A/T)(A/T)(A/T)(A/T)(A/T)(A/T)GG(A/C/T) SEQ ID by TF motifNO: 23 regulatory control AG binding site in CCATTTTTAGT SEQ ID by TFAP3 NO: 24 regulatory control AG binding site in CCATTTTTGG SEQ ID by TFSUP NO: 25 regulatory control AGLI binding siteNTT(A/G/T)CC(A/T)(A/T)(A/T)(A/T)NNGG(A/T)AAN SEQ ID by TF motif NO: 26regulatory control AGL2 binding siteNN(A/T)NCCA(A/T)(A/T)(A/T)(A/T)T(A/G)G(A/T)(A/T)AN SEQ ID by TF motifNO: 27 regulatory control AGL3 binding siteTT(A/T)C(C/T)A(A/T)(A/T)(A/T)(A/T)T(A/G)G(A/T)AA SEQ ID by TF motifNO: 28 regulatory control AP1 binding site in CCATTTTTAG SEQ ID by TFAP3 NO: 29 regulatory control AP1 binding site in CCATTTTTGG SEQ IDby TF SUP NO: 30 regulatory control ARF binding site TGTCTC SEQ ID by TFmotif NO: 31 regulatory control ARF1 binding site TGTCTC SEQ ID by TFmotif NO: 32 regulatory control ATHB1 binding site CAAT(A/T)ATTG SEQ IDby TF motif NO: 33 regulatory control ATHB2 binding site CAAT(C/G)ATTGSEQ ID by TF motif NO: 34 regulatory control ATHB5 binding siteCAATNATTG SEQ ID by TF motif NO: 35 regulatory controlATHB6 binding site CAATTATTA SEQ ID by TF motif NO: 36regulatory control AtMYB2 binding CTAACCA SEQ ID by TF site in RD22NO: 37 regulatory control AtMYC2 binding CACATG SEQ ID by TFsite in RD22 NO: 38 regulatory control Box II promoter GGTTAA SEQ IDby TF motif NO: 39 regulatory control CArG promoterCC(A/T)(A/T)(A/T)(A/T)(A/T)(A/T)GG SEQ ID by TF motif NO: 40regulatory control CArG1 motif in AP3 GTTTACATAAATGGAAAA SEQ ID by TFNO: 41 regulatory control CArG2 motif in AP3 CTTACCTTTCATGGATTA SEQ IDby TF NO: 42 regulatory 
control CArG3 motif in AP3 CTTTCCATTTTTAGTAACSEQ ID by TF NO: 43 regulatory control CBF1 binding site in TGGCCGACSEQ ID by TF cor15a NO: 44 regulatory control CBF2 binding site CCACGTGGSEQ ID by TF motif NO: 45 regulatory control CCA1 binding siteAA(A/C)AATCT SEQ ID by TF motif NO: 46 regulatory control CCA1 motif1AAACAATCTA SEQ ID by TF binding site in CAB1 NO: 47 regulatory controlCCA1 motif2 AAAAAAAATCTATGA SEQ ID by TF binding site in CAB1 NO: 48regulatory control DPBF1&2 binding ACACNNG SEQ ID by TF site motifNO: 49 regulatory control DRE promoter motif TACCGACAT SEQ ID by TFNO: 50 regulatory control DREB 1&2 binding TACCGACAT SEQ ID by TFsite in rd29a NO: 51 regulatory control DRE-like promoter(A/G/T)(A/G)CCGACN(A/T) SEQ ID by TF motif NO: 52 regulatory controlE2F binding site TTTCCCGC SEQ ID by TF motif NO: 53 regulatory controlE2F/DP binding site TTTCCCGC SEQ ID by TF in AtCDC6 NO: 54regulatory control E2F-varient binding TCTCCCGCC SEQ ID by TF site motifNO: 55 regulatory control EIL1 binding site in TTCAAGGGGGCATGTATCTTGAASEQ ID by TF ERF1 NO: 56 regulatory control EIL2 binding site inTTCAAGGGGGCATGTATCTTGAA SEQ ID by TF ERF1 NO: 57 regulatory controlEIL3 binding site in TTCAAGGGGGCATGTATCTTGAA SEQ ID by TF ERF1 NO: 58regulatory control EIN3 binding site in GGATTCAAGGGGGCATGTATCTTGAATCCSEQ ID by TF ERF1 NO: 59 regulatory control ERE promoter motifTAAGAGCCGCC SEQ ID by TF NO: 60 regulatory control ERF1 binding site inGCCGCC SEQ ID by TF AtCHI-B NO: 61 regulatory control EveningElementAAAATATCT SEQ ID by TF promoter motif NO: 62 regulatory controlGATA promoter (A/T)GATA(G/A) SEQ ID by TF motif NO: 63regulatory control GBF1/2/3 binding CCACGTGG SEQ ID by TF site in ADH1NO: 64 regulatory control G-box promoter CACGTG SEQ ID by TF motifNO: 65 regulatory control GCC-box promoter GCCGCC SEQ ID by TF motifNO: 66 regulatory control GT promoter motif TGTGTGGTTAATATG SEQ ID by TFNO: 67 regulatory control Hexamer promoter CCGTCG SEQ ID by TF motifNO: 68 regulatory 
control HSEs binding site AGAANNTTCT SEQ ID by TFmotif NO: 69 regulatory control Ibox promoter motif GATAAG SEQ ID by TFNO: 70 regulatory control JASE1 motif in CGTCAATGAA SEQ ID by TF OPR1NO: 71 regulatory control JASE2 motif in CATACGTCGTCAA SEQ ID by TF OPR2NO: 72 regulatory control L1-box promoter TAAATG(C/T)A SEQ ID by TFmotif NO: 73 regulatory control LS5 promoter motif ACGTCATAGA SEQ IDby TF NO: 74 regulatory control LS7 promoter motif TCTACGTCAC SEQ IDby TF NO: 75 regulatory control LTRE promoter ACCGACA SEQ ID by TF motifNO: 76 regulatory control MRE motif in CHS TCTAACCTACCA SEQ ID by TFNO: 77 regulatory control MYB binding site (A/C)ACC(A/T)A(A/C)C SEQ IDby TF promoter NO: 78 regulatory control MYB1 binding site(A/C)TCC(A/T)ACC SEQ ID by TF motif NO: 79 regulatory controlMYB2 binding site TAACT(G/C)GTT SEQ ID by TF motif NO: 80regulatory control MYB3 binding site TAACTAAC SEQ ID by TF motif NO: 81regulatory control MYB4 binding site A(A/C)C(A/T)A(A/C)C SEQ ID by TFmotif NO: 82 regulatory control Nonamer promoter AGATCGACG SEQ ID by TFmotif NO: 83 regulatory control OBF4, 5 binding siteATCTTATGTCATTGATGACGACCTCC SEQ ID by TF in GST6 NO: 84regulatory control OBP-1, 4, 5 binding TACACTTTTGG SEQ ID by TFsite in GST6 NO: 85 regulatory control OCS promoter motifTGACG(C/T)AAG(C/G)(A/G)(A/C)T(G/T)ACG(C/T)(A/C)(A/C) SEQ ID by TF NO: 86regulatory control octamer promoter CGCGGATC SEQ ID by TF motif NO: 87regulatory control PI promoter motif GTGATCAC SEQ ID by TF NO: 88regulatory control PII promoter motif TTGGTTTTGATCAAAACCAA SEQ ID by TFNO: 89 regulatory control PRHA binding site in TAATTGACTCAATTA SEQ IDby TF PAL1 NO: 90 regulatory control RAV1-A binding site CAACA SEQ IDby TF motif NO: 91 regulatory control RAV1-B binding site CACCTG SEQ IDby TF motif NO: 92 regulatory control RY-repeat promoter CATGCATG SEQ IDby TF motif NO: 93 regulatory control SBP-box promoter TNCGTACAA SEQ IDby TF motif NO: 94 regulatory control T-box promoter ACTTTG SEQ ID by TFmotif NO: 
95 regulatory control TEF-box promoter AGGGGCATAATGGTAA SEQ IDby TF motif NO: 96 regulatory control TELO-box promoter AAACCCTAA SEQ IDby TF motif NO: 97 regulatory control TGA1 binding site TGACGTGG SEQ IDby TF motif NO: 98 regulatory control W-box promoter TTGAC SEQ ID by TFmotif NO: 99 regulatory control Z-box promoter ATACGTGT SEQ ID by TFmotif NO: 100 regulatory control AG binding site in AAAACAGAATAGGAAASEQ ID by TF SPL/NOZ NO: 101 regulatory control Bellringer/ AAATTAAASEQ ID by TF replumless/ NO: 102 pennywise binding site IN AGregulatory control Bellringer/ AAATTAGT SEQ ID by TF replumless/ NO: 103pennywise binding site 2 in AG regulatory control Bellringer/ ACTAATTTSEQ ID by TF replumless/ NO: 104 pennywise binding site 3 in AGregulatory control AGL15 binding site CCAATTTAATGG SEQ ID by TFin AtGA2ox6 NO: 105 regulatory control ATB2/AtbZIP53/ ACTCAT SEQ IDby TF AtbZIP44/GBF5 NO: 106 binding site in ProDH regulatory controlLFY binding site in CTTAAACCCTAGGGGTAAT SEQ ID by TF AP3 NO: 107regulatory control SORLREP1 TT(A/T)TACTAGT SEQ ID by TF NO: 108regulatory control SORLREP2 ATAAAACGT SEQ ID by TF NO: 109regulatory control SORLREP3 TGTATATAT SEQ ID by TF NO: 110regulatory control SORLREP4 CTCCTAATT SEQ ID by TF NO: 111regulatory control SORLREP5 TTGCATGACT SEQ ID by TF NO: 112regulatory control SORLIP1 AGCCAC SEQ ID by TF NO: 113regulatory control SORLIP2 GGGCC SEQ ID by TF NO: 114 regulatory controlSORLIP3 CTCAAGTGA SEQ ID by TF NO: 115 regulatory control SORLIP4GTATGATGG SEQ ID by TF NO: 116 regulatory control SORLIP5 GAGTGAG SEQ IDby TF NO: 117 regulatory control ABFs binding site CACGTGGC SEQ ID by TFmotif NO: 118 down NdeI restriction siteGTTTAATTGAGTTGTCATATGTTAATAACGGTAT SEQ ID NO: 119 downNdeI restriction site ATACCGTTATTAACATATGACAACTCAATTAAAC SEQ ID NO: 120up (auxin 3 × DR5 auxin- CCGACAAAAGGCCGACAAAAGGCCGACAAAAGGT SEQ IDresponsive) response element NO: 121 up (auxin 3 × DR5 auxin-ACCTTTTGTCGGCCTTTTGTCGGCCTTTTGTCGG SEQ ID responsive) response 
elementNO: 122 up (auxin 6 × DR5 auxin-GCCGACAAAAGGCCGACAAAAGGCCGACAAAAGGCCGACAAAAGGCCGACA SEQ ID responsive)responsive element AAAGGCCGACAAAAGGT NO: 123 up (auxin 6 × DR5 auxin-ACCTTTTGTCGGCCTTTTGTCGGCCTTTTGTCGGCCTTTTGTCGGCCTTTT SEQ ID responsive)responsive element GTCGGCCTTTTGTCGGC NO: 124 up (auxin 9 × DR5 auxin-CCGACAAAAGGCCGACAAAAGGCCGACAAAAGGCCGACAAAAGGCCGACAA SEQ ID responsive)responsive element AAGGCCGACAAAAGGCCGACAAAAGGCCGACAAAAGGCCGACAAAAGGTNO: 125 up (auxin 9 × DR5 auxin-ACCTTTTGTCGGCCTTTTGTCGGCCTTTTGTCGGCCTTTTGTCGGCCTTTT SEQ ID responsive)responsive element GTCGGCCTTTTGTCGGCCTTTTGTCGGCCTTTTGTCGGCCTTTTGTCGGNO: 126 Cre recombinase LoxP (wild-type 1)ATAACTTCGTATAGCATACATTATACGAAGTTAT SEQ ID recognition site NO: 127Cre recombinase LoxP (wild-type 2) ATAACTTCGTATAATGTATGCTATACGAAGTTATSEQ ID recognition site NO: 128 Cre recombinase Canonical LoxPATAACTTCGTATANNNTANNNTATACGAAGTTAT SEQ ID recognition site NO: 129Cre recombinase Lox 511 ATAACTTCGTATAATGTATaCTATACGAAGTTAT SEQ IDrecognition site NO: 130 Cre recombinase Lox 5171ATAACTTCGTATAATGTgTaCTATACGAAGTTAT SEQ ID recognition site NO: 131Cre recombinase Lox 2272 ATAACTTCGTATAAaGTATcCTATACGAAGTTAT SEQ IDrecognition site NO: 132 Cre recombinase M2ATAACTTCGTATAAgaaAccaTATACGAAGTTAT SEQ ID recognition site NO: 133Cre recombinase M3 ATAACTTCGTATAtaaTACCATATACGAAGTTAT SEQ IDrecognition site NO: 134 Cre recombinase M7ATAACTTCGTATAAgaTAGAATATACGAAGTTAT SEQ ID recognition site NO: 135Cre recombinase M11 ATAACTTCGTATAaGATAgaaTATACGAAGTTAT SEQ IDrecognition site NO: 136 Cre recombinase Lox 71taccgTTCGTATANNNTANNNTATACGAAGTTAT SEQ ID recognition site NO: 137Cre recombinase Lox 66 ATAACTTCGTATANNNTANNNTATACGAAcggta SEQ IDrecognition site NO: 138 maize ovule/early miR156j recognitionGTGCTCTCTCTCTTCTGTCA SEQ ID kernel transcript site NO: 139down-regulation maize ovule/early miR156j recognitionCTGCTCTCTCTCTTCTGTCA SEQ ID kernel transcript site NO: 140down-regulation maize ovule/early miR156j recognitionTTGCTTACTCTCTTCTGTCA 
SEQ ID kernel transcript site NO: 141down-regulation maize ovule/early miR156j recognitionCCGCTCTCTCTCTTCTGTCA SEQ ID kernel transcript site NO: 142down-regulation maize ovule/early miR159c recognitionTGGAGCTCCCTTCATTCCAAT SEQ ID kernel transcript site NO: 143down-regulation maize ovule/early miR159c recognitionTCGAGTTCCCTTCATTCCAAT SEQ ID kernel transcript site NO: 144down-regulation maize ovule/early miR159c recognitionATGAGCTCTCTTCAAACCAAA SEQ ID kernel transcript site NO: 145down-regulation maize ovule/early miR159c recognitionTGGAGCTCCCTTCATTCCAAG SEQ ID kernel transcript site NO: 146down-regulation maize ovule/early miR159c recognitionTAGAGCTTCCTTCAAACCAAA SEQ ID kernel transcript site NO: 147down-regulation maize ovule/early miR159c recognitionTGGAGCTCCATTCGATCCAAA SEQ ID kernel transcript site NO: 148down-regulation maize ovule/early miR159c recognitionAGCAGCTCCCTTCAAACCAAA SEQ ID kernel transcript site NO: 149down-regulation maize ovule/early miR159c recognitionCAGAGCTCCCTTCACTCCAAT SEQ ID kernel transcript site NO: 150down-regulation maize ovule/early miR159c recognitionTGGAGCTCCCTTCACTCCAAT SEQ ID kernel transcript site NO: 151down-regulation maize ovule/early miR159c recognitionTGGAGCTCCCTTCACTCCAAG SEQ ID kernel transcript site NO: 152down-regulation maize ovule/early miR159c recognitionTGGAGCTCCCTTTAATCCAAT SEQ ID kernel transcript site NO: 153down-regulation maize embryo miR166b recognition TTGGGATGAAGCCTGGTCCGGSEQ ID transcript down- site NO: 154 regulation maize embryomiR166b recognition CTGGGATGAAGCCTGGTCCGG SEQ ID transcript down- siteNO: 155 regulation maize embryo miR166b recognitionCTGGAATGAAGCCTGGTCCGG SEQ ID transcript down- site NO: 156 regulationmaize embryo miR166b recognition CGGGATGAAGCCTGGTCCGG SEQ IDtranscript down- site NO: 157 regulation maize endospermmiR167g recognition GAGATCAGGCTGGCAGCTTGT SEQ ID transcript down- siteNO: 158 regulation maize endosperm miR167g recognitionTAGATCAGGCTGGCAGCTTGT SEQ ID transcript down- site NO: 159 
regulationmaize endosperm miR167g recognition AAGATCAGGCTGGCAGCTTGT SEQ IDtranscript down- site NO: 160 regulation maize pollenmiR156i recognition GTGCTCTCTCTCTTCTGTCA SEQ ID transcript down- siteNO: 161 regulation maize pollen miR156i recognition CTGCTCTCTCTCTTCTGTCASEQ ID transcript down- site NO: 162 regulation maize pollenmiR156i recognition TTGCTTACTCTCTTCTGTCA SEQ ID transcript down- siteNO: 163 regulation maize pollen miR156i recognition CCGCTCTCTCTCTTCTGTCASEQ ID transcript down- site NO: 164 regulation maize pollenmir160b-like TGGCATGCAGGGAGCCAGGCA SEQ ID transcript down-recognition site NO: 165 regulation maize pollen mir160b-likeAGGAATACAGGGAGCCAGGCA SEQ ID transcript down- recognition site NO: 166regulation maize pollen mir160b-like GGGTTTACAGGGAGCCAGGCA SEQ IDtranscript down- recognition site NO: 167 regulation maize pollenmir160b-like AGGCATACAGGGAGCCAGGCA SEQ ID transcript down-recognition site NO: 168 regulation maize pollen miR393a recognitionAAACAATGCGATCCCTTTGGA SEQ ID transcript down- site NO: 169 regulationmaize pollen miR393a recognition AGACCATGCGATCCCTTTGGA SEQ IDtranscript down- site NO: 170 regulation maize pollenmiR393a recognition GGTCAGAGCGATCCCTTTGGC SEQ ID transcript down- siteNO: 171 regulation maize pollen miR393a recognitionAGACAATGCGATCCCTTTGGA SEQ ID transcript down- site NO: 172 regulationmaize pollen miR396a recognition TCGTTCAAGAAAGCCTGTGGAA SEQ IDtranscript down- site NO: 173 regulation maize pollenmiR396a recognition CGTTCAAGAAAGCCTGTGGAA SEQ ID transcript down- siteNO: 174 regulation maize pollen miR396a recognitionTCGTTCAAGAAAGCATGTGGAA SEQ ID transcript down- site NO: 175 regulationmaize pollen miR396a recognition ACGTTCAAGAAAGCTTGTGGAA SEQ IDtranscript down- site NO: 176 regulation maize pollenmiR396a recognition CGTTCAAGAAAGCCTGTGGAA SEQ ID transcript down- siteNO: 177 regulation maize male tissue male tissue-specificGGACAACAAGCACCTTCTTGCCTTGCAAGGCCTCCCTTCCCTATGGTAGCCACT SEQ IDtranscript down- siRNA 
elementTGAGTGGATGACTTCACCTTAAAGCTATTGATTCCCTAAGTGCCAGACATAATA NO: 178regulation GGCTATACATTCTCTCTGGTGGCAACAATGAGCCATTTTGGTTGGTGTGGTAGTCTATTATTGAGTTTTTTTTGGCACCGTACTCCCATGGAGAGTAGAAGACAAACTCTTCACCGTTGTAGTCGTTGATGGTATTGGTGGTGACGACATCCTTGGTGTGCATGCACTGGTGAGTCACTGTTGTACTCGGCG maize male tissue male tissue-specificGGACAACAAGCACCTTCTTGCCTTGCAAGGCCTCCCTTCCCTATGGTAGCCACT SEQ IDtranscript down- siRNA elementTGAGTGGATGACTTCACCTTAAAGCTATCGATTCCCTAAGTGCCAGACAT NO: 179 regulationmaize male tissue male tissue-specificCTCTTCACCGTTGTAGTCGTTGATGGTATTGGTGGTGACGACATCCTTGGTGTG SEQ IDtranscript down- siRNA element CATGCACTGGTGAGTCACTGTTGTAC NO: 180regulation maize male tissue male tissue-specificGGACAACAAGCACCTTCTTGCCTTGCAAGGCCTCCCTTCCCTATGGTAGCCACT SEQ IDtranscript down- siRNA elementTGAGTGGATGACTTCACCTTAAAGCTATCGATTCCCTAAGTGCCAGACATCTCT NO: 181regulation TCACCGTTGTAGTCGTTGATGGTATTGGTGGTGACGACATCCTTGGTGTGCATGCACTGGTGAGTCACTGTTGTAC maize male tissue male tissue-specificCTCTTCACCGTTGTAGTCGTTGATGGTATTGGTGGTGACGACATCCTTGGTGTG SEQ IDtranscript down- siRNA elementCATGCACTGGTGAGTCACTGTTGTACGGACAACAAGCACCTTCTTGCCTTGCAA NO: 182regulation GGCCTCCCTTCCCTATGGTAGCCACTTGAGTGGATGACTTCACCTTAAAGCTATCGATTCCCTAAGTGCCAGACAT up (auxin ocs enhancer ACGTAAGCGCTTACGT SEQ IDreponsive; (Agrobacterium sp.) 
NO: 183 constitutive) up (auxin12-nt ocs orthologue GTAAGCGCTTAC SEQ ID reponsive; (Zea mays) NO: 184constitutive) up (nitrogen AtNREAAGAGATGAGCTCTTGAGCAATGTAAAGGGTCAAGTTGTTTCT SEQ ID responsive) NO: 185up (nitrogen AtNRE AGAAACAACTTGACCCTTTACATTGCTCAAGAGCTCATCTCTT SEQ IDresponsive) NO: 186 up (auxin 3 × DR5 auxin-ACCUUUUGUCGGCCUUUUGUCGGCCUUUUGUCGG SEQ ID responsive) response element;NO: 187 RNA strand of RNA/DNA hybid up (auxin 3 × DR5 auxin-TCGGTCCGACAAAAGGCCGACAAAAGGCGGACAAAAGG SEQ ID responsive)response element; NO: 188 sticky-ended up (auxin 3 × DR5 auxin-ACCGACCTTTTGTCGGCCTTTTGTCGGCCTTTTGTCGG SEQ ID responsive)response element; NO: 189 sticky-ended down or up OsTBF1 uORF2ATGGGAGTAGAGGCGGGCGGCGGCTGCGGTGGGAGGGCGGTAGTCACCGGAT SEQ ID (MAMP-TCTACGTCTGGGGCTGGGAGTTCCTCACCGCCCTCCTGCTCTTCTCGGCCACCA NO: 190responsive) CCTCCTACTAG down or up synthetic R-motif AAAAAAAAAAAAAAASEQ ID (MAMP- NO: 191 responsive) down or up AtTBF1 R-motifCACATACACACAAAAATAAAAAAGA SEQ ID (MAMP- NO: 192 responsive) decreasesinsulator GAATATATATATATTC SEQ ID upregulation NO: 193 sequenceZmEPSPS exon 1 GTGAACAACCTTATGAAATTTGGGCGCATAACTTCGTATAGCATACATTATACGSEQ ID modification with two pointAAGTTATAAAGAACTCGCCCTCAAGGGTTGATCTTATGCCATCGTCATGATAA NO: 194mutations and ACAGTGGAGCACGGACGATCCTTTACGTTGTTTTTAACAAACTTTGTCAGAAAAheterospecific lox CTAGCATCATTAACTTCTTAATGACGATTTCACAACAAAAAAAGGTAACCTCGsites CTACTAACATAACAAAATACTTGTTGCTTATTAATTATATGTTTTTTAATCTTTGATCAGGGGACAACAGTGGTTGATAACCTGTTGAACAGTGAGGATGTCCACTACATGCTCGGGGCCTTGAGGACTCTTGGTCTCTCTGTCGAAGCGGACAAAGCTGCCAAAAGAGCTGTAGTTGTTGGCTGTGGTGGAAAGTTCCCAGTTGAGGATTCTAAAGAGGAAGTGCAGCTCTTCTTGGGGAATGCTGGAATTGCAATGCGGGCATTGACAGCAGCTGTTACTGCTGCTGGTGGAAATGCAACGTATGTTTCCTCTCTTTCTCTCTACAATACTTGCATAACTTCGTATAAAGTATCCTATACGAAGTTATTGGAGTTAGTATGAAACCCATGGGTATGTCTAGT decreases miniature inverted-TACTCCCTCCGTTTCTTTTTATTAGTCGCTGGATAGTGCAATTTTGCACTATC SEQ IDupregulation repeat transposable CAGCGACTAATAAAAAGAAACGGAGGGAGTA NO: 195element (“MITE”) decreases miniature 
inverted-TACTCCCTCCGTTTCTTTTTATTAGTCGCTGGATAGTGCAAAATTGCACTAT SEQ ID upregulationrepeat transposable CCAGCGACTAATAAAAAGAAACGGAGGGAGTA NO: 196element (“MITE”) up (constitutive) G-box ACACGTGACACGTGACACGTGACACGTGSEQ ID NO: 197 decreased mRNA destabilizingTTATTTATTTTATTTATTTTATTTATTTTATTTATT SEQ ID transcript element NO: 198stability (mammalian) decreased mRNA destabilizingAATTTTAATTTTAATTTTAATTTTAATTTTAATTTT SEQ ID transcriptelement (Arabidopsis NO: 199 stability thaliana) increasedmRNA stabilizing TCTCTTTCTCTTTCTCTTTCTCTTTCTCTTTCTCTT SEQ ID transcriptelement NO: 200 stability down SHAT1-repressorATTAAAAAAATAAATAAGATATTATTAAAAAAATAAATAAGATATTATTAAAA SEQ IDAAATAAATAAGATATTATTAAAAAAATAAATAAGATATT NO: 201 decreased SAUR mRNAAGATCTAGGAGACTGACATAGATTGGAGGAGACATTTTGTATAATAAGATCTA SEQ ID transcriptdestabilizing GGAGACTGACATAGATTGGAGGAGACATTTTGTATAATA NO: 202 stabilityelement down by recruiting CTCC CTCC(T/A/G)CC(G/T/A) SEQ IDtranscription NO: 203 factors interacting with PRC2 down by recruitingCCG (C/T/A)(G/T)C(C/A)(G/A)(C/A)C(G/T)(C/A) SEQ ID transcription NO: 204factors interacting with PRC2 down by recruiting G-box(C/G)ACGTGGNN(G/A/C)(T/A) SEQ ID transcription NO: 205factors interacting with PRC2 down by recruiting GA repeatA(G/A)A(G/A)AGA(G/A)(A/G) SEQ ID transcription NO: 206factors interacting with PRC2 down by recruiting AC-richCA(A/T/C)CA(C/A)CA(A/C/T) SEQ ID transcription NO: 207factors interacting with PRC2 down by recruiting Telobox(A/G)AACCC(T/A)A(A/G) SEQ ID transcription NO: 208 factors interactingwith PRC2 up (Pi starvation PIBS GNATATNC SEQ ID response) NO: 209 UpOCS GTAAGCGCTTAC SEQ ID NO: 210 Up GboxGCCACGTGCCGCCACGTGCCGCCACGTGCCGCCACGTGCC SEQ ID NO: 211 Up Green tissue-AAAATATTTATAAAATATTTATAAAATATTTATAAAATATTTAT SEQ ID specific NO: 212promoter (GSP) Up E2F binding site CCCGCCAAACCCGCCAAACCCGCCAAACCCGCCAAASEQ ID NO: 213 Down mRNA destablizingAATTTTAATTTTAATTTTAATTTTAATTTTAATTTT SEQ ID element NO: 214 DownSilencer GAATATATATATATTC SEQ ID NO: 215

TABLE 10 Gene Expression Gene Protein Trait Gene(s) Gene ID ProductChange SEQ ID SEQ ID Abiotic stress DREB1-like 547622 Protein IncreaseSEQ ID NO: 235 SEQ ID NO: 456 Abiotic stress DREB1 547642 ProteinIncrease SEQ ID NO: 236 SEQ ID NO: 457 Abiotic stress NRP-A 547685Protein Increase SEQ ID NO: 237 SEQ ID NO: 458 Abiotic stress PAP3547708 Protein Increase SEQ ID NO: 238 SEQ ID NO: 459 Abiotic stressVSP25 547821 Protein Increase SEQ ID NO: 239 SEQ ID NO: 460 Abioticstress MP2 547827 Protein Increase SEQ ID NO: 240 SEQ ID NO: 461 Abioticstress BIP 547839 Protein Increase SEQ ID NO: 241 SEQ ID NO: 462 Abioticstress PPCK3 548089 Protein Increase SEQ ID NO: 242 SEQ ID NO: 463Abiotic stress IFS2 606705 Protein Increase SEQ ID NO: 243 SEQ ID NO:464 Abiotic Stress NAC81 732555 Protein Decrease SEQ ID NO: 244 SEQ IDNO: 465 Abiotic stress DREB2 732579 Protein Increase SEQ ID NO: 245 SEQID NO: 466 Abiotic stress LOC732608 732608 Protein Increase SEQ ID NO:246 SEQ ID NO: 467 Abiotic stress LOC732656 732656 Protein Increase SEQID NO: 247 SEQ ID NO: 468 Abiotic stress LOC778160 778160 ProteinIncrease SEQ ID NO: 248 SEQ ID NO: 469 Abiotic stress BZIP132 778192Protein Increase SEQ ID NO: 249 SEQ ID NO: 470 Abiotic stress GMWRKY46100127375 Protein Increase SEQ ID NO: 250 SEQ ID NO: 471 Abiotic stressOSBP 100137077 Protein Increase SEQ ID NO: 251 SEQ ID NO: 472 Abioticstress GT-2B 100137081 Protein Increase SEQ ID NO: 252 SEQ ID NO: 473Abiotic stress NAC19 100170713 Protein Increase SEQ ID NO: 253 SEQ IDNO: 474 Abiotic stress LOC100170723 100170723 Protein Increase SEQ IDNO: 254 SEQ ID NO: 475 Abiotic stress GMRD22 100301893 Protein IncreaseSEQ ID NO: 255 SEQ ID NO: 476 Abiotic stress GSTU4 100527381 ProteinIncrease SEQ ID NO: 256 SEQ ID NO: 477 Abiotic stress LOC100776453100776453 Protein Increase SEQ ID NO: 257 SEQ ID NO: 478 Abiotic stressLOC100779440 100779440 Protein Increase SEQ ID NO: 258 SEQ ID NO: 479Abiotic stress GSK-3 100780226 Protein Increase SEQ ID NO: 259 SEQ IDNO: 
480 Abiotic stress GMPIP1-6 100780356 Protein Increase SEQ ID NO:260 SEQ ID NO: 481 Abiotic stress LOC100782841 100782841 ProteinIncrease SEQ ID NO: 261 SEQ ID NO: 482 Abiotic stress LOC100782841100782841 Protein Increase SEQ ID NO: 262 SEQ ID NO: 483 Abiotic stressGMSIZ1B 100793735 Protein Increase SEQ ID NO: 263 SEQ ID NO: 484 Abioticstress LOC100794096 100794096 Protein Increase SEQ ID NO: 264 SEQ ID NO:485 Abiotic stress GMPUB8 100795263 Protein Decrease SEQ ID NO: 265 SEQID NO: 486 Abiotic Stress GmRCAb 100797222 Protein Change codingsequence SEQ ID NO: 266 SEQ ID NO: 487 Abiotic stress GMSIZ1A 100797252Protein Increase SEQ ID NO: 267 SEQ ID NO: 488 Abiotic stress AP2-7100800134 Protein Increase SEQ ID NO: 268 SEQ ID NO: 489 Abiotic stressLOC100800453 100800453 Protein Increase SEQ ID NO: 269 SEQ ID NO: 490Abiotic stress RFP1 100806337 Protein Increase SEQ ID NO: 270 SEQ ID NO:491 Abiotic Stress GmPYL9 100810273 Protein Change coding sequence SEQID NO: 271 SEQ ID NO: 492 Abiotic stress LOC100812768 100812768 ProteinIncrease SEQ ID NO: 272 SEQ ID NO: 493 Abiotic stress LOC100819467100819467 Protein Increase SEQ ID NO: 273 SEQ ID NO: 494 Abiotic stressGMHSF-34 100820298 Protein Increase SEQ ID NO: 274 SEQ ID NO: 495Abiotic stress MIR172C 100886233 miRNA Increase SEQ ID NO: 275Architecture W1 547705 Protein Increase/Decrease SEQ ID NO: 276 SEQ IDNO: 497 Architecture ENOD55-2 547770 Protein Increase SEQ ID NO: 277 SEQID NO: 498 Architecture ENOD93 547773 Protein Increase SEQ ID NO: 278SEQ ID NO: 499 Architecture CLV1B 732625 Protein Decrease SEQ ID NO: 279SEQ ID NO: 500 Architecture AKR1 100301897 Protein Decrease SEQ ID NO:280 SEQ ID NO: 501 Architecture GMMFT 100306314 ProteinIncrease/Decrease SEQ ID NO: 281 SEQ ID NO: 502 ArchitectureLOC100499629 100499629 Protein Increase/Decrease SEQ ID NO: 282 SEQ IDNO: 503 Architecture LOC100775555 100775555 Protein Increase SEQ ID NO:283 SEQ ID NO: 504 Architecture Dt1 100776154 Protein Increase/DecreaseSEQ ID NO: 284 SEQ 
ID NO: 505 Architecture GMFT3B 100781509 ProteinIncrease/Decrease SEQ ID NO: 285 SEQ ID NO: 506 Architecture GS52100781628 Protein Increase SEQ ID NO: 286 SEQ ID NO: 507 Architecture W2100782308 Protein Increase/Decrease SEQ ID NO: 287 SEQ ID NO: 508Architecture LOC100787444 100787444 Protein Decrease SEQ ID NO: 288 SEQID NO: 509 Architecture LOC100790763 100790763 Protein Increase/DecreaseSEQ ID NO: 289 SEQ ID NO: 510 Architecture TFL1.1 100791809 ProteinDecrease SEQ ID NO: 290 SEQ ID NO: 511 Architecture GMFT5A 100796994Protein Increase/Decrease SEQ ID NO: 291 SEQ ID NO: 512 ArchitectureGMFT3A 100803909 Protein Increase/Decrease SEQ ID NO: 292 SEQ ID NO: 513Architecture LOC100813937 100813937 Protein Decrease SEQ ID NO: 293 SEQID NO: 514 Architecture GMFT2A 100814951 Protein Increase/Decrease SEQID NO: 294 SEQ ID NO: 515 Architecture LOC100818062 100818062 ProteinDecrease SEQ ID NO: 295 SEQ ID NO: 516 Architecture LN 102661548 ProteinIncrease/Decrease SEQ ID NO: 296 SEQ ID NO: 517 Architecture CLV3a102662349 Protein Decrease SEQ ID NO: 297 SEQ ID NO: 518 ArchitectureLOC102664687 102664687 Protein Increase/Decrease SEQ ID NO: 298 SEQ IDNO: 519 Architecture CLV3b 102669448 Protein Decrease SEQ ID NO: 299 SEQID NO: 520 Biotic stress PDR12 547508 Protein Increase SEQ ID NO: 300SEQ ID NO: 521 Biotic stress DREB1 547642 Protein Increase SEQ ID NO:301 SEQ ID NO: 522 Biotic stress CYSTATIN 547777 Protein Increase SEQ IDNO: 302 SEQ ID NO: 523 Biotic stress PRP 547791 Protein Increase SEQ IDNO: 303 SEQ ID NO: 524 Biotic stress ACPD 547808 Protein Decrease SEQ IDNO: 304 SEQ ID NO: 525 Biotic stress PGIP 547838 Protein Increase SEQ IDNO: 305 SEQ ID NO: 526 Biotic stress L1 547875 Protein Increase SEQ IDNO: 306 SEQ ID NO: 527 Biotic stress LOC100818432 547983 ProteinIncrease SEQ ID NO: 307 SEQ ID NO: 528 Biotic stress MIPS 548084 ProteinIncrease SEQ ID NO: 308 SEQ ID NO: 529 Biotic stress LOC100815291 732578Protein Increase SEQ ID NO: 309 SEQ ID NO: 530 Biotic 
stressLOC100797842 3989271 Protein Increase SEQ ID NO: 310 Biotic stress N-36A3989355 Protein Decrease SEQ ID NO: 311 Biotic stress N2 15308528Protein Increase SEQ ID NO: 312 Biotic stress LOC100797449 15308540Protein Increase SEQ ID NO: 313 Biotic stress RHG1 100101892 ProteinIncrease SEQ ID NO: 314 SEQ ID NO: 535 Biotic stress G4DT 100301896Protein Increase SEQ ID NO: 315 SEQ ID NO: 536 Biotic stress PGIP4100305373 Protein Increase SEQ ID NO: 316 SEQ ID NO: 537 Biotic stressRLK3 100499747 Protein Increase SEQ ID NO: 317 SEQ ID NO: 538 Bioticstress rps1 100500504 Protein Increase SEQ ID NO: 318 SEQ ID NO: 539Biotic stress LOC100811309 100787186 Protein Decrease SEQ ID NO: 319 SEQID NO: 540 Biotic stress LOC100794096 100794096 Protein Increase SEQ IDNO: 320 SEQ ID NO: 541 Biotic stress LOC100795239 100795239 ProteinIncrease SEQ ID NO: 321 SEQ ID NO: 542 Biotic stress RLK 100795799Protein Increase SEQ ID NO: 322 SEQ ID NO: 543 Biotic stress VLXB100797716 Protein Increase SEQ ID NO: 323 SEQ ID NO: 544 Biotic stressLOC547834 100797843 Protein Increase SEQ ID NO: 324 SEQ ID NO: 545Biotic stress NES 100799695 Protein Increase SEQ ID NO: 325 SEQ ID NO:546 Biotic stress LOC547704 100803679 Protein Increase SEQ ID NO: 326SEQ ID NO: 547 Biotic stress HSP70 100816111 Protein Decrease SEQ ID NO:327 SEQ ID NO: 548 Flowering Time SOYAP1 547478 ProteinIncrease/Decrease SEQ ID NO: 328 SEQ ID NO: 549 Flowering Time PHYA547810 Protein Increase/Decrease SEQ ID NO: 329 SEQ ID NO: 550 FloweringTime LOC100037477 100037477 Protein Increase/Decrease SEQ ID NO: 330 SEQID NO: 551 Flowering Time CRY1a 100233233 Protein Increase/Decrease SEQID NO: 331 SEQ ID NO: 552 Flowering Time TOC1 100271889 ProteinIncrease/Decrease SEQ ID NO: 332 SEQ ID NO: 553 Flowering Time COL2a100301885 Protein Increase/Decrease SEQ ID NO: 333 SEQ ID NO: 554Flowering Time FKF1 100301889 Protein Increase/Decrease SEQ ID NO: 334SEQ ID NO: 555 Flowering Time AGL11 100301905 Protein Increase/DecreaseSEQ ID NO: 335 SEQ 
ID NO: 556
Flowering Time | GMMFT | 100306314 | Protein | Increase/Decrease | SEQ ID NO: 336 | SEQ ID NO: 557
Flowering Time | GIGANTEA | 100779044 | Protein | Increase/Decrease | SEQ ID NO: 337 | SEQ ID NO: 558
Flowering Time | GMFT3B | 100781509 | Protein | Increase/Decrease | SEQ ID NO: 338 | SEQ ID NO: 559
Flowering Time | FLD | 100786453 | Protein | Increase/Decrease | SEQ ID NO: 339 | SEQ ID NO: 560
Flowering Time | ELF3A | 100793561 | Protein | Increase/Decrease | SEQ ID NO: 340 | SEQ ID NO: 561
Flowering Time | CIB1 | 100794256 | Protein | Increase/Decrease | SEQ ID NO: 341 | SEQ ID NO: 562
Flowering Time | GMFT5A | 100796994 | Protein | Increase/Decrease | SEQ ID NO: 342 | SEQ ID NO: 563
Flowering Time | LOC100799720 | 100799720 | Protein | Increase/Decrease | SEQ ID NO: 343 | SEQ ID NO: 564
Flowering Time | PHYB | 100799831 | Protein | Increase/Decrease | SEQ ID NO: 344 | SEQ ID NO: 565
Flowering Time | GMGIA | 100800578 | Protein | Increase/Decrease | SEQ ID NO: 345 | SEQ ID NO: 566
Flowering Time | LOC100801792 | 100801792 | Protein | Increase/Decrease | SEQ ID NO: 346 | SEQ ID NO: 567
Flowering Time | GMFT3A | 100803909 | Protein | Increase/Decrease | SEQ ID NO: 347 | SEQ ID NO: 568
Flowering Time | LOC100810415 | 100810415 | Protein | Increase/Decrease | SEQ ID NO: 348 | SEQ ID NO: 569
Flowering Time | GMFT2A | 100814951 | Protein | Increase/Decrease | SEQ ID NO: 349 | SEQ ID NO: 570
Flowering Time | LOC102666452 | 102666452 | Protein | Increase/Decrease | SEQ ID NO: 350 | SEQ ID NO: 571
Flowering Time | LOC102667341 | 102667341 | Protein | Increase/Decrease | SEQ ID NO: 351 | SEQ ID NO: 572
Flowering Time | LOC102670334 | 102670334 | Protein | Increase/Decrease | SEQ ID NO: 352 | SEQ ID NO: 573
Nutrient use efficiency | MIPS | 547604 | Protein | Decrease | SEQ ID NO: 353 | SEQ ID NO: 574
Nutrient use efficiency | DMT1 | 547711 | Protein | Increase | SEQ ID NO: 354 | SEQ ID NO: 575
Nutrient use efficiency | LOX1.2 | 547774 | Protein | Decrease | SEQ ID NO: 355 | SEQ ID NO: 576
Nutrient use efficiency | ACPD | 547808 | Protein | Decrease | SEQ ID NO: 356 | SEQ ID NO: 577
Nutrient use efficiency | LOX1.3 | 547869 | Protein | Knockout/Decrease | SEQ ID NO: 357 | SEQ ID NO: 578
Nutrient use efficiency | AS2 | 547894 | Protein | Increase | SEQ ID NO: 358 | SEQ ID NO: 579
Nutrient use efficiency | AS1 | 547895 | Protein | Increase | SEQ ID NO: 359 | SEQ ID NO: 580
Nutrient use efficiency | LOX1.1 | 547923 | Protein | Knockout/Decrease | SEQ ID NO: 360 | SEQ ID NO: 581
Nutrient use efficiency | LOC547940 | 547940 | Protein | Increase | SEQ ID NO: 361 | SEQ ID NO: 582
Nutrient use efficiency | NRT2 | 547946 | Protein | Increase | SEQ ID NO: 362 | SEQ ID NO: 583
Nutrient use efficiency | GS | 548082 | Protein | Increase | SEQ ID NO: 363 | SEQ ID NO: 584
Nutrient use efficiency | IPK1 | 100127406 | Protein | Knockout/Decrease | SEQ ID NO: 364 | SEQ ID NO: 585
Nutrient use efficiency | SAD2 | 100217331 | Protein | Decrease | SEQ ID NO: 365 | SEQ ID NO: 586
Nutrient use efficiency | LOC100527257 | 100527257 | Protein | Increase | SEQ ID NO: 366 | SEQ ID NO: 587
Nutrient use efficiency | LOC100775983 | 100775983 | Protein | Increase | SEQ ID NO: 367 | SEQ ID NO: 588
Nutrient use efficiency | PHR1 | 100783132 | Protein | Increase | SEQ ID NO: 368 | SEQ ID NO: 589
Nutrient use efficiency | GMPT5 | 100786638 | Protein | Increase | SEQ ID NO: 369 | SEQ ID NO: 590
Nutrient use efficiency | PAP4 | 100790529 | Protein | Increase | SEQ ID NO: 370 | SEQ ID NO: 591
Photosynthesis | VDE | 100778118 | Protein | Increase | SEQ ID NO: 371 | SEQ ID NO: 592
Photosynthesis | PSBS | 100779417 | Protein | Increase | SEQ ID NO: 372 | SEQ ID NO: 593
Photosynthesis | ZEP | 100800186 | Protein | Increase | SEQ ID NO: 373 | SEQ ID NO: 594
Photosynthesis | PSBS | 100807355 | Protein | Increase | SEQ ID NO: 374 | SEQ ID NO: 595
Photosynthesis | VDE | 100816085 | Protein | Increase | SEQ ID NO: 375 | SEQ ID NO: 596
Photosynthesis | ZEP | 100820171 | Protein | Increase | SEQ ID NO: 376 | SEQ ID NO: 597
Photosynthesis | DREB1 | 547642 | Protein | Increase | SEQ ID NO: 377 | SEQ ID NO: 598
Photosynthesis | VPE | 547964 | Protein | Increase | SEQ ID NO: 378 | SEQ ID NO: 599
Photosynthesis | PIP | 100811119 | Protein | Increase | SEQ ID NO: 379 | SEQ ID NO: 600
Resource partitioning | AO | 547647 | Protein | Increase | SEQ ID NO: 380 | SEQ ID NO: 601
Resource partitioning | ACPD | 547808 | Protein | Decrease | SEQ ID NO: 381 | SEQ ID NO: 602
Resource partitioning | FBP | 547809 | Protein | Increase | SEQ ID NO: 382 | SEQ ID NO: 603
Resource partitioning | AS1 | 547895 | Protein | Increase | SEQ ID NO: 383 | SEQ ID NO: 604
Resource partitioning | REDUCTASE | 547911 | Protein | Increase | SEQ ID NO: 384 | SEQ ID NO: 605
Resource partitioning | LOC547940 | 547940 | Protein | Increase | SEQ ID NO: 385 | SEQ ID NO: 606
Resource partitioning | DGAT1C | 547982 | Protein | Increase/Decrease | SEQ ID NO: 386 | SEQ ID NO: 607
Resource partitioning | DGAT1C | 547982 | Protein | Increase/Decrease | SEQ ID NO: 387 | SEQ ID NO: 608
Resource partitioning | DGAT1A | 548005 | Protein | Increase/Decrease | SEQ ID NO: 388 | SEQ ID NO: 609
Resource partitioning | DGAT1A | 548005 | Protein | Increase/Decrease | SEQ ID NO: 389 | SEQ ID NO: 610
Resource partitioning | GOLS | 548050 | Protein | Decrease | SEQ ID NO: 390 | SEQ ID NO: 611
Resource partitioning | BBI | 548083 | Protein | Decrease | SEQ ID NO: 391 | SEQ ID NO: 612
Resource partitioning | DGAT1B | 732606 | Protein | Increase/Decrease | SEQ ID NO: 392 | SEQ ID NO: 613
Resource partitioning | Dof4 | 778097 | Protein | Increase | SEQ ID NO: 393 | SEQ ID NO: 614
Resource partitioning | MYB73 | 778179 | Protein | Increase/Decrease | SEQ ID NO: 394 | SEQ ID NO: 615
Resource partitioning | IFS1 | 100037450 | Protein | Increase | SEQ ID NO: 395 | SEQ ID NO: 616
Resource partitioning | SACPD-C | 100037478 | Protein | Increase/Decrease | SEQ ID NO: 396 | SEQ ID NO: 617
Resource partitioning | FAD3 | 100038323 | Protein | Increase/Decrease | | SEQ ID NO: 618
Resource partitioning | AAH1 | 100137075 | Protein | Increase | SEQ ID NO: 398 | SEQ ID NO: 619
Resource partitioning | LOC100194415 | 100194415 | Protein | Increase | SEQ ID NO: 399 | SEQ ID NO: 620
Resource partitioning | COL2a | 100301885 | Protein | Regulate for Flowering | SEQ ID NO: 400 | SEQ ID NO: 621
Resource partitioning | CYP73A11 | 100499623 | Protein | Increase | SEQ ID NO: 401 | SEQ ID NO: 622
Resource partitioning | LOC100499629 | 100499629 | Protein | Decrease | SEQ ID NO: 402 | SEQ ID NO: 623
Resource partitioning | LOC100775672 | 100775672 | Protein | Increase | SEQ ID NO: 403 | SEQ ID NO: 624
Resource partitioning | LOC100775983 | 100775983 | Protein | Increase | SEQ ID NO: 404 | SEQ ID NO: 625
Resource partitioning | PHYTASE | 100778145 | Protein | Increase | SEQ ID NO: 405 | SEQ ID NO: 626
Resource partitioning | LOC100783693 | 100783693 | Protein | Increase | SEQ ID NO: 406 | SEQ ID NO: 627
Resource partitioning | LOC100788179 | 100788179 | Protein | Increase | SEQ ID NO: 407 | SEQ ID NO: 628
Resource partitioning | TFL1.1 | 100791809 | Protein | Regulate for Flowering | SEQ ID NO: 408 | SEQ ID NO: 629
Resource partitioning | LOC100797018 | 100797018 | Protein | Increase | SEQ ID NO: 409 | SEQ ID NO: 630
Resource partitioning | LOC100799931 | 100799931 | Protein | Increase | SEQ ID NO: 410 | SEQ ID NO: 631
Resource partitioning | LOC100800931 | 100800931 | Protein | Increase | SEQ ID NO: 411 | SEQ ID NO: 632
Resource partitioning | LOC100803398 | 100803398 | Protein | Increase | SEQ ID NO: 412 | SEQ ID NO: 633
Resource partitioning | FAD2 | 100805777 | Protein | Increase/Decrease | SEQ ID NO: 413 | SEQ ID NO: 634
Resource partitioning | LOC100807749 | 100807749 | Protein | Increase | SEQ ID NO: 414 | SEQ ID NO: 635
Resource partitioning | LOC100809706 | 100809706 | Protein | Increase | SEQ ID NO: 415 | SEQ ID NO: 636
Resource partitioning | LOC100813937 | 100813937 | Protein | Decrease | SEQ ID NO: 416 | SEQ ID NO: 637
Resource partitioning | LOC100814531 | 100814531 | Protein | Increase | SEQ ID NO: 417 | SEQ ID NO: 638
Resource partitioning | GMFT2A | 100814951 | Protein | Increase/Decrease | SEQ ID NO: 418 | SEQ ID NO: 639
Resource partitioning | LOC102666452 | 102666452 | Protein | Regulate for Flowering | SEQ ID NO: 419 | SEQ ID NO: 640
Senescence | GST | 547580 | Protein | Decrease | SEQ ID NO: 420 | SEQ ID NO: 641
Senescence | NRP-A | 547685 | Protein | Increase/Decrease | SEQ ID NO: 421 | SEQ ID NO: 642
Senescence | PGIP | 547838 | Protein | Increase/Decrease | SEQ ID NO: 422 | SEQ ID NO: 643
Senescence | VPE | 547964 | Protein | Decrease | SEQ ID NO: 423 | SEQ ID NO: 644
Senescence | NAC2 | 732553 | Protein | Decrease | SEQ ID NO: 424 | SEQ ID NO: 645
Senescence | SGR1 | 732647 | Protein | Increase | SEQ ID NO: 425 | SEQ ID NO: 646
Senescence | petB | 3989327 | Protein | Increase | SEQ ID NO: 426 |
Senescence | SAN1A | 100101864 | Protein | Decrease | SEQ ID NO: 427 | SEQ ID NO: 648
Senescence | LOC100233234 | 100233234 | Protein | Increase/Decrease | SEQ ID NO: 428 | SEQ ID NO: 649
Senescence | GMSGR | 100301892 | Protein | Increase | SEQ ID NO: 429 | SEQ ID NO: 650
Senescence | CIB1 | 100794256 | Protein | Increase/Decrease | SEQ ID NO: 430 | SEQ ID NO: 651
Senescence | LOC100808627 | 100808627 | Protein | Increase/Decrease | SEQ ID NO: 431 | SEQ ID NO: 652
Senescence | SAN1B | 100810801 | Protein | Increase | SEQ ID NO: 432 | SEQ ID NO: 653
Senescence | GMCRY2B | 100811759 | Protein | Increase/Decrease | SEQ ID NO: 433 | SEQ ID NO: 654
Senescence | SAN1C | 100811875 | Protein | Increase/Decrease | SEQ ID NO: 434 |
Yield | LOC100777102 | 100777102 | Protein | Increase | SEQ ID NO: 435 | SEQ ID NO: 656

All cited patents and patent publications referred to in this application are incorporated herein by reference in their entirety. All of the materials and methods disclosed and claimed herein can be made and used without undue experimentation as instructed by the above disclosure and illustrated by the examples. Although the materials and methods of this invention have been described in terms of embodiments and illustrative examples, it will be apparent to those of skill in the art that substitutions and variations can be applied to the materials and methods described herein without departing from the concept, spirit, and scope of the invention. For instance, while the particular examples provided illustrate the methods and embodiments described herein using a specific plant, the principles in these examples are applicable to any plant of interest; similarly, while the particular examples provided illustrate the methods and embodiments described herein using a particular sequence-specific nuclease such as Cas9, one of skill in the art would recognize that alternative sequence-specific nucleases (e.g., CRISPR nucleases other than Cas9, such as CasX, CasY, and Cpf1, zinc-finger nucleases, transcription activator-like effector nucleases, Argonaute proteins, and meganucleases) are useful in various embodiments. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope, and concept of the invention as encompassed by the embodiments of the inventions recited herein and the specification and appended claims.

We claim:
 1. A modified soybean cell, wherein the modified soybean cell comprises two or more targeted modifications, wherein at least one of the targeted modifications occurs in a gene selected from the list of genes in Table 10 or in a regulatory sequence affecting the expression of the gene.
 2. The modified soybean cell of claim 1, wherein the gene is associated with a trait, and wherein the trait is selected from the group consisting of abiotic stress, architecture, biotic stress, nutrient use efficiency, photosynthesis, resource partitioning, and senescence, and wherein the modification improves the trait in a cell comprising the modification relative to a cell lacking the modification, or in a plant grown from a cell comprising the modification relative to a plant lacking the modification.
 3. The modified soybean cell of claim 1, wherein the targeted gene, before modification, is at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical to a nucleic acid sequence listed in Table 10.
 4. The modified soybean cell of claim 1, wherein the targeted gene encodes a protein, and wherein the amino acid sequence of the encoded protein before targeting is at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical to the amino acid sequence of a protein listed in Table 10.
 5. The modified soybean cell of claim 1, wherein the targeted modifications result in increased expression of the gene or decreased expression of the gene.
 6. The modified soybean cell of claim 1, wherein at least one targeted modification comprises a modification selected from the group consisting of: insertion of a nucleotide sequence encoded by a polynucleotide donor molecule; deletion of genomic sequence at a double-strand break in the genome or between multiple double-strand breaks in the genome; and a change in nucleic acid or amino acid sequence.
 7. The modified soybean cell of claim 6, wherein the insertion comprises or creates an upregulatory sequence.
 8. The modified soybean cell of claim 6, wherein the insertion comprises or creates a down-regulatory sequence.
 9. The modified soybean cell of claim 1, wherein at least one targeted modification knocks out expression of the targeted gene.
 10. The modified soybean cell of claim 1, wherein at least one targeted modification comprises the insertion or creation of at least one transcription factor binding site.
 11. The modified soybean cell of claim 1, wherein at least one targeted modification is the introduction of a sequence recognizable by a specific binding agent, and wherein contacting the sequence with the specific binding agent results in a change of expression of a sequence of interest.
 12. The modified soybean cell of claim 1, wherein at least one targeted modification includes the introduction of a nucleotide sequence that encodes an RNA molecule or an amino acid sequence that is recognizable by a specific binding agent.
 13. The modified soybean cell of claim 1, wherein the modified soybean cell comprises an inserted regulatory sequence selected from the sequences listed in Table 9.
 14. The modified soybean cell of claim 1, wherein the modified soybean cell is an isolated cell or protoplast.
 15. The modified soybean cell of claim 1, wherein the modified soybean cell is in a soybean plant, or in a zygotic or somatic embryo, seed, part, or tissue of a soybean plant.
 16. The modified soybean cell of claim 1, wherein the modified soybean cell is capable of division or differentiation.
 17. The modified soybean cell of claim 1, wherein the modified soybean cell is haploid, diploid, or polyploid.
 18. The modified soybean cell of claim 1, wherein the modified soybean cell is a meristematic cell, embryonic cell, or germline cell.
 19. The modified soybean cell of claim 1, wherein the modifications are determined relative to a parent plant cell, and wherein the modified plant cell is devoid of mitotically or meiotically generated genetic or epigenetic changes relative to the parent plant cell.
 20. A modified soybean plant comprising the modified soybean cell of claim 1, or a progeny plant or progeny seed of the modified soybean plant, wherein cells of the modified soybean plant, progeny plant, or progeny seed comprise the targeted modifications.